THE GROWTH AND AGE-DISTRIBUTION OF A
POPULATION OF INSECTS
UNDER UNIFORM CONDITIONS


E. J.
Division of Mathematical Statistics,
C.S.I.R.O., Canberra, Australia.


I. INTRODUCTION 


This paper deals with the growth of populations of insects such as 
aphids that reproduce continuously under uniform conditions. Its 
object is to derive the relations among the different parameters of the 
populations: the birth rate, the death rates at different stages, the 
age-distribution, and the intrinsic rate of natural increase (Lotka 
[1945]). Interest in these relations arose from the need to estimate 
the size of aphid populations by indirect means; the work presented 
here is supplementary to that of Hughes [1961], who was responsible 
for initiating the experimental investigations and who saw the possi- 
bility of estimating the rate of growth of an aphid population by means 
of a study of its age-distribution. 

Insects pass through a number of immature stages (instars) which 
may be readily distinguished. If the average duration of each instar 
is known, a count of the numbers of insects in each provides a con- 
veniently grouped age-distribution. By studying this age-distribution 
at successive intervals we can estimate the way in which the popula- 
tion is developing and thus indirectly determine the rate of growth 
of the population. 

Although the original investigation was of an aphid population, 
the problem will be framed in terms sufficiently general to apply to 
several types of insect with similar pattern of development. 


II. FORMULATION OF THE PROBLEM


We shall set out in rather general terms the conditions under which 
the population is developing, so that the results may be applied to 
different types of insect. 

We consider an insect with k instars, and designate the parameters
of the population as follows:

$\gamma$ = birth rate of adults¹,
$\mu_i$ = death rate during the ith instar,
$\mu_A$ = death rate of adults.


For simplicity these parameters will all be assumed constant within 
the stage of growth (immature or adult) to which they apply. 

We denote by $n_i(t)$ the number in the ith instar at time $t$, and by
$n_A(t)$ the number of adults at time $t$.

We shall assume that the durations of successive instars are in- 
dependently distributed; although it could have been assumed that 
the lengths of successive instars for any insect are negatively correlated, 
so that an instar of unusually long duration will be partly compensated 
by a short one, the evidence for this assumption is not convincing. 

We shall denote by $x_i$ the duration of the ith instar for some insect,
and by $X_i$ the duration of the first i instars, so that

$$X_i = x_1 + \cdots + x_i.$$

Both $x_i$ and $X_i$ are random variables, whose probability densities
we shall denote by $f_i(x_i)$ and $f^i(X_i)$, respectively. Under the
assumption of independence, the density $f^i(X_i)$ is the i-fold convolution
of the densities for the first i instars.

We denote the corresponding distribution functions by F with
appropriate affixes; thus

$$F_i(x_i) = \int_0^{x_i} f_i(u)\,du,$$

and

$$F^i(X_i) = \int_0^{X_i} f^i(u)\,du.$$

Since under uniform conditions the population will be increasing
or decreasing at a constant rate, we shall find it convenient to express
the population size in terms of the Laplace transforms of the various
probability densities. We shall write

$$\phi_i(\tau) = \int_0^\infty e^{-\tau u} f_i(u)\,du.$$


The Laplace transforms for the convolution distributions are simply 
products of those for successive instar distributions. 
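The product rule for the transforms can be checked numerically. The sketch below is an illustration only: the two exponential instar densities, their means (2 and 3 time units) and the trapezoidal integrator `laplace` are assumptions made here, not part of the paper.

```python
import math

def laplace(density, tau, upper=100.0, steps=100_000):
    """Trapezoidal approximation to the integral of exp(-tau*u) * density(u) over (0, upper)."""
    h = upper / steps
    total = 0.5 * (density(0.0) + math.exp(-tau * upper) * density(upper))
    for j in range(1, steps):
        u = j * h
        total += math.exp(-tau * u) * density(u)
    return total * h

m1, m2 = 2.0, 3.0  # hypothetical mean durations of two successive instars

def f1(x):
    return math.exp(-x / m1) / m1

def f2(x):
    return math.exp(-x / m2) / m2

def f_sum(x):
    # Density of x1 + x2 for independent exponential durations with unequal means.
    return (math.exp(-x / m2) - math.exp(-x / m1)) / (m2 - m1)

tau = 0.4
product = laplace(f1, tau) * laplace(f2, tau)   # phi_1(tau) * phi_2(tau)
direct = laplace(f_sum, tau)                    # transform of the convolution
print(product, direct)
```

The two printed values should agree to within the integration error, which is the content of the statement that the transform of a convolution is the product of the component transforms.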
The Laplace transform plays an important part in many biological
problems. An interpretation of its biological applications has been
given by Cox [1957].

¹This is to be distinguished from the birth rate of the population as a whole, which is customarily
denoted by A (see Hughes [1961]).

The distribution of durations among survivors in any instar will be
affected by the death rate. For the ith instar with death rate $\mu_i$,
the density of a duration $x_i$ will be proportional to

$$e^{-\mu_i x_i} f_i(x_i),$$

and will thus be given by

$$g_i(x_i) = e^{-\mu_i x_i} f_i(x_i)/\phi_i(\mu_i).$$

We shall now consider the age-distribution, and derive thence the 
rate of growth of the population. In these populations under favourable 
conditions, the rate of increase is high and mortality has only a small 
effect. We therefore first consider how the population would grow if 
there were no mortality. 

III. POPULATION GROWTH WITH NO MORTALITY 
The chance that an insect is in the ith instar at a time $t$ after birth is

$$P(X_i > t > X_{i-1}) = F^{i-1}(t) - F^i(t),$$

where $F^0(t) = 1$ for $t \ge 0$; likewise, the chance that it is an adult is

$$P(t > X_k) = F^k(t).$$

Now the number of births in an interval $(u, u + du)$ is

$$\gamma\, n_A(u)\,du + o(du),$$

so the number expected in the ith instar at time $t$ is

$$n_i(t) = \gamma \int_{-\infty}^{t} n_A(u)\,[F^{i-1}(t-u) - F^i(t-u)]\,du.$$


We have in particular for the adult stage the result

$$n_A(t) = \gamma \int_{-\infty}^{t} n_A(u)\,F^k(t-u)\,du. \tag{1}$$

Being of homogeneous type, the integral equation (1) has the solution

$$n_A(t) = n_A(0)\,e^{\rho t},$$

where $\rho$ is the solution of the equation

$$1 = \gamma \int_{-\infty}^{t} e^{-\rho(t-u)} F^k(t-u)\,du,$$

or

$$1 = \gamma \int_0^\infty e^{-\rho u} F^k(u)\,du = \frac{\gamma}{\rho}\,\phi_1\phi_2\cdots\phi_k, \tag{2}$$

where we have suppressed the argument of the Laplace transforms.
This equation corresponds to the one given by Lotka [1925], p. 115,
equation (25).

In equation (2) the function $F^k(u)$ represents the probability that
an individual is a member of the adult population at a time $u$ after
birth. It thus corresponds to Lotka's $p(a)$ (the probability of survival
at time $a$ after birth), although $p(a)$ is a decreasing function of $a$,
whereas $F^k(X)$ is an increasing function of $X$. The modes of development
of the two populations are quite different: the individuals of
one are subject to mortality, whereas the individuals of the other are
subject to delay in reaching maturity.
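As an illustration of how equation (2) determines the growth rate, the sketch below reads it as $\rho = \gamma\,\phi_1(\rho)\cdots\phi_k(\rho)$ and solves for $\rho$ by bisection. The function name `solve_growth_rate`, the birth rate and the fixed instar lengths are hypothetical choices for the example, not values from the paper.

```python
import math

def solve_growth_rate(gamma, transforms, tol=1e-12):
    """Bisection for rho in (0, gamma) satisfying rho = gamma * prod(phi_i(rho))."""
    def excess(rho):
        prod = 1.0
        for phi in transforms:
            prod *= phi(rho)
        return rho - gamma * prod

    # excess(0) = -gamma < 0, and excess(gamma) >= 0 because every phi_i <= 1.
    lo, hi = 0.0, gamma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: four instars of fixed length 2 days each
# (so phi_i(rho) = exp(-2 rho)) and 3 births per adult per day.
lengths = [2.0, 2.0, 2.0, 2.0]
transforms = [lambda r, x=x: math.exp(-r * x) for x in lengths]
rho = solve_growth_rate(3.0, transforms)
print(rho)
```

Bisection is used because the left-hand side minus the right-hand side is increasing in $\rho$ whenever each transform is a decreasing function, so the root is unique.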

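We can express the numbers of individuals in each instar in terms
of the Laplace transforms with argument $\rho$:

$$n_i(t) = \frac{\gamma}{\rho}\,n_A(t)\,\phi_1\cdots\phi_{i-1}(1-\phi_i).$$

The total number of all stages at time $t$ is

$$n(t) = n_A(t) + \sum_{i=1}^{k} n_i(t) = \frac{\gamma}{\rho}\,n_A(t). \tag{3}$$

The fraction of the population in the ith instar is

$$\phi_1\cdots\phi_{i-1}(1-\phi_i),$$

and the fraction in the adult stage is

$$\phi_1\phi_2\cdots\phi_k = \rho/\gamma, \tag{4}$$

independently of $t$.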

This exponential solution agrees with physical considerations, and
describes the steadily increasing population to be expected when conditions
have been constant for a long period. The development of the
population is described by the constant growth rate $\rho$.
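The stable distribution between stages can be sketched in a few lines, taking the fraction in the ith instar to be $\phi_1\cdots\phi_{i-1}(1-\phi_i)$ and the adult fraction to be $\phi_1\cdots\phi_k$. The helper `stage_fractions` and the transform values 0.8 and 0.5 are hypothetical.

```python
def stage_fractions(phis):
    """Fractions in instars 1..k, followed by the adult fraction, given phi_i(rho)."""
    fractions = []
    reached = 1.0                      # phi_1 * ... * phi_{i-1}
    for phi in phis:
        fractions.append(reached * (1.0 - phi))
        reached *= phi
    fractions.append(reached)          # adults: phi_1 * ... * phi_k
    return fractions

fractions = stage_fractions([0.8, 0.5])   # two instars, hypothetical transform values
print(fractions, sum(fractions))
```

The fractions telescope, so they sum to one whatever transform values are supplied.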

If, as a result of changing conditions, the initial age-distribution 
is different from that given by the solutions (4), the solution for the 
distribution between instars will be more complicated. However, it 
is convenient in practice to confine observations to steadily increasing
populations under uniform conditions; the behaviour of such popula- 
tions is fairly easily interpreted. 

We can also find the distribution of durations in any instar (the
'instar age-distribution'), which is of some theoretical interest. In
the steadily increasing population the number of individuals entering
the ith instar a time $x$ previous to observation is proportional to

$$e^{-\rho x}.$$

The number of individuals of instar-age $x$ in the ith instar is the number
which have entered a time $x$ ago, reduced by the factor

$$1 - F_i(x),$$

for those which have left the ith instar. Hence the probability density
of instar-ages in the ith instar is

$$\frac{e^{-\rho x}[1 - F_i(x)]}{\int_0^\infty e^{-\rho u}[1 - F_i(u)]\,du} = \frac{\rho\,e^{-\rho x}[1 - F_i(x)]}{1 - \phi_i}, \tag{5}$$

and the density of adult ages is

$$\rho\,e^{-\rho x}. \tag{5A}$$


The expressions (4) and (5) between them completely specify the 
age-distribution in a steadily increasing population without mortality. 
If we know the age-distribution between instars, we can in principle 
equate the expressions (4) to the known probabilities to determine 
the rate of growth of the population. 
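A minimal sketch of this idea, under assumptions that go beyond the general argument: instar lengths are taken as fixed and mortality is ignored, so the fraction of the population in the first i instars is $1 - e^{-\rho X_i}$ and each cumulative count gives an estimate of $\rho$. The function name and the counts are hypothetical.

```python
import math

def growth_rate_estimates(counts, lengths):
    """One estimate of rho per instar boundary.

    counts  : numbers observed in instars 1..k, followed by the adult count
    lengths : assumed fixed lengths of instars 1..k
    """
    total = sum(counts)
    estimates = []
    cum_count, cum_length = 0.0, 0.0
    for count, length in zip(counts, lengths):   # zip stops after the k instars
        cum_count += count
        cum_length += length
        estimates.append(-math.log(1.0 - cum_count / total) / cum_length)
    return estimates

# Hypothetical counts for two instars (lengths 2 and 3 days) and the adults.
print(growth_rate_estimates([450, 330, 220], [2.0, 3.0]))
```

If the estimates from successive boundaries differ widely, that would suggest the assumptions (fixed lengths, no mortality, stable age-distribution) do not hold for the sample in hand.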


IV. EFFECT OF MORTALITY ON POPULATION GROWTH 


When the effect of mortality is taken into account, we have to 
allow for the chance that the individual will die before reaching the 
instar considered. Also, since we are assuming a different mortality 
in each developmental stage, a somewhat more elaborate analysis is 
required than was given in Section III. 

We assume that the net rate of growth of the population is $\rho$, and
consider the movement of individuals through the ith instar. Since
the death rate is $\mu_i$, the number of individuals entering the ith instar
at a time $x$ previous to observation and surviving is proportional to

$$e^{-(\rho+\mu_i)x},$$

so that, as in Section III, the density of instar-ages is

$$\frac{(\rho+\mu_i)\,e^{-(\rho+\mu_i)x}[1 - F_i(x)]}{1 - \phi_i(\rho+\mu_i)}, \tag{6}$$

and the density of adult ages is

$$(\rho+\mu_A)\,e^{-(\rho+\mu_A)x}. \tag{6A}$$

The effect of mortality in the ith instar is thus to replace $\rho$ by
$\rho+\mu_i$ in the distribution of instar-ages. In general it can be seen that
the effect of mortality is to replace $\rho$ by $\rho+\mu_i$ in all functions of the
ith instar distribution.

The number of individuals going from instar $i-1$ to instar $i$ in
an interval $(t, t+dt)$ is

$$\frac{(\rho+\mu_{i-1})\,\phi_{i-1}}{1-\phi_{i-1}}\,n_{i-1}(t)\,dt + o(dt).$$

Similarly the number going from instar $i$ to instar $i+1$ is

$$\frac{(\rho+\mu_i)\,\phi_i}{1-\phi_i}\,n_i(t)\,dt + o(dt),$$

where the argument of the Laplace transform has been suppressed
since it is indicated by the subscript: $\phi_i$ stands for $\phi_i(\rho+\mu_i)$.
Also the number dying is

$$n_i(t)\,\mu_i\,dt + o(dt).$$


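Hence, since the rate of population growth is $\rho$, we have

$$\frac{(\rho+\mu_i)\,n_i(t)}{1-\phi_i} = \frac{(\rho+\mu_{i-1})\,\phi_{i-1}\,n_{i-1}(t)}{1-\phi_{i-1}}, \tag{7}$$

and

$$(\rho+\mu_A)\,n_A(t) = \frac{(\rho+\mu_k)\,\phi_k\,n_k(t)}{1-\phi_k}. \tag{7A}$$

Taking into account the birth rate $\gamma$, we have

$$\frac{(\rho+\mu_1)\,n_1(t)}{1-\phi_1} = \gamma\,n_A(t).$$

Thus the numbers in each instar can be expressed in terms of the number
of adults living at the same time:

$$n_i(t) = \gamma\,n_A(t)\,\frac{\phi_1\cdots\phi_{i-1}(1-\phi_i)}{\rho+\mu_i}. \tag{8}$$

Similarly, from (7A) we have

$$(\rho+\mu_A)\,n_A(t) = \gamma\,n_A(t)\,\phi_1\phi_2\cdots\phi_k,$$

giving the equation satisfied by the growth rate:

$$\frac{\rho+\mu_A}{\gamma} = \phi_1\phi_2\cdots\phi_k, \tag{9}$$

which corresponds to (2).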

It is convenient to express equation (9) explicitly as an integral
equation similar to (2). In this integral equation the densities $g_i$ of
durations adjusted for death rate replace the original densities $f_i$.

The basic relation for this representation is

$$\phi_i(\rho+\mu_i) = \int_0^\infty e^{-(\rho+\mu_i)u} f_i(u)\,du = \phi_i(\mu_i)\int_0^\infty e^{-\rho u} g_i(u)\,du = \phi_i(\mu_i)\,\psi_i(\rho),$$

where $\psi_i(\rho)$ is the Laplace transform of the density $g_i(x_i)$. This relation
expresses the Laplace transforms appearing in (9) as the product
of two transforms, one depending on the death rate for the instar, the
other on the growth rate of the population.

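Thus equation (9) is explicitly

$$\begin{aligned}
\frac{\rho+\mu_A}{\gamma} &= \phi_1(\rho+\mu_1)\cdots\phi_k(\rho+\mu_k) \\
&= \phi_1(\mu_1)\cdots\phi_k(\mu_k)\,\psi_1(\rho)\,\psi_2(\rho)\cdots\psi_k(\rho) \\
&= \phi_1(\mu_1)\cdots\phi_k(\mu_k)\int_0^\infty e^{-\rho u} g^k(u)\,du,
\end{aligned} \tag{10}$$

where $g^k$ is the k-fold convolution of the adjusted densities $g_1, \ldots, g_k$.

Finally, equation (10) can be expressed in a manner similar to Lotka's
equation if we observe that $\psi_A(\rho) = \mu_A/(\rho+\mu_A)$.

The equation then becomes

$$\frac{\mu_A}{\gamma} = \phi_1(\mu_1)\cdots\phi_k(\mu_k)\int_0^\infty e^{-\rho u} g^A(u)\,du, \tag{11}$$

where $g^A(X_A)$ is the distribution of the total life span. Equation (2)
clearly cannot be expressed in this way, since (11) would be meaningless
if there were no mortality.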

The reason why equations (10) and (11) differ from the original 
Lotka equation is that we have here assumed that the death rate is 
not directly time-dependent, but instar-dependent. Thus the rate of 
growth of the population depends on the distribution of durations of 
the different instars. When the death rate in each instar is the same 
($\mu_i = \mu$), equation (10) may be reduced to

$$\frac{\rho+\mu_A}{\gamma} = \int_0^\infty e^{-(\rho+\mu)u} f^k(u)\,du,$$


which is comparable with (2) and with Lotka’s original equation. 


V. RESULTS FOR PARTICULAR INSTAR
DURATION DISTRIBUTIONS

The results of Sections III and IV are intractable unless some assumption
is made about the distribution of the durations of the instars.
We shall here consider two simple distributions which may be suitable
approximations and may indicate the form that the results will take
in general. We give the results when there is mortality, from which
the results for no mortality can be easily deduced.


(a) Constant instar length 


If the length of the ith instar is a constant $x_i$, the results take a
fairly simple form.
Firstly,

$$\phi_i(\tau) = e^{-\tau x_i}.$$

The probability density of instar-ages in the ith instar is

$$\frac{(\rho+\mu_i)\,e^{-(\rho+\mu_i)x}}{1 - e^{-(\rho+\mu_i)x_i}} \quad (x < x_i); \qquad = 0 \quad (x > x_i).$$

The growth rate of the population is defined by

$$\frac{\rho+\mu_A}{\gamma} = \exp\Bigl\{-\sum_{h=1}^{k}(\rho+\mu_h)x_h\Bigr\},$$

and the number in the ith instar is

$$n_i(t) = \gamma\,n_A(t)\,\exp\Bigl\{-\sum_{h=1}^{i-1}(\rho+\mu_h)x_h\Bigr\}\,\frac{1 - \exp\{-(\rho+\mu_i)x_i\}}{\rho+\mu_i}.$$

It is often reasonable to assume that death rate is independent of
age. When this is done, the equations reduce to a simple form, which
is found to give results corresponding closely with observation for
many aphid populations.

Putting $\mu_i = \mu_A = \mu$, we have

$$n_i(t) = \gamma\,n_A(t)\,e^{-(\rho+\mu)X_{i-1}}\,\frac{1 - e^{-(\rho+\mu)x_i}}{\rho+\mu}.$$

The growth rate of the population is given by

$$\frac{\rho+\mu}{\gamma} = e^{-(\rho+\mu)X_k}.$$

It is seen that, if the death rate is increased, the actual growth rate is
decreased by the same amount.

Hughes [1961] has worked with the assumption of constant death
rate. He defines the sum $\rho+\mu$ as the potential rate of natural increase
of the populations. The justification for this definition is that,
if all causes of death could be removed, the rate of growth of the population
would in fact be $\rho+\mu$.


(b) Exponential distribution 


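We here assume that the duration of the ith instar is exponentially
distributed with mean $x_i$;

$$f_i(x) = \frac{1}{x_i}\,e^{-x/x_i}.$$

The Laplace transform here takes the simple form

$$\phi_i(\tau) = 1/(1 + \tau x_i).$$

Then the distribution of instar-ages is given by

$$\frac{1 + (\rho+\mu_i)x_i}{x_i}\,\exp\Bigl\{-\frac{[1 + (\rho+\mu_i)x_i]\,x}{x_i}\Bigr\},$$

which is of the same form as the distribution of durations, but with
mean

$$\frac{x_i}{1 + (\rho+\mu_i)x_i}.$$

The growth rate of the population is given by

$$\frac{\rho+\mu_A}{\gamma} = \frac{1}{\prod_{h=1}^{k}[1 + (\rho+\mu_h)x_h]},$$

an algebraic equation of degree $k+1$. The number in the ith instar is

$$n_i(t) = \frac{\gamma\,n_A(t)\,x_i}{\prod_{h=1}^{i}[1 + (\rho+\mu_h)x_h]}.$$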
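For exponentially distributed instar durations the growth-rate equation can be solved numerically rather than as a polynomial. The sketch below takes the equation in the form $(\rho+\mu_A)\prod_h[1+(\rho+\mu_h)x_h] = \gamma$ and the instar numbers as $n_i/n_A = \gamma x_i/\prod_{h\le i}[1+(\rho+\mu_h)x_h]$; both function names and all numerical values are hypothetical.

```python
def solve_rho_exponential(gamma, mu_adult, means, mus, tol=1e-12):
    """Bisection for rho in (rho + mu_A) * prod(1 + (rho + mu_h) x_h) = gamma."""
    def excess(rho):
        value = rho + mu_adult
        for x, mu in zip(means, mus):
            value *= 1.0 + (rho + mu) * x
        return value - gamma

    lo = -min(min(mus), mu_adult)   # above lo every rho + mu is non-negative
    hi = gamma                      # excess(gamma) >= 0 for non-negative death rates
    if excess(lo) >= 0.0:
        raise ValueError("no root with rho + mu >= 0 in every stage")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def instars_per_adult(rho, gamma, means, mus):
    """n_i(t) / n_A(t) for each instar under exponential durations."""
    ratios, denom = [], 1.0
    for x, mu in zip(means, mus):
        denom *= 1.0 + (rho + mu) * x
        ratios.append(gamma * x / denom)
    return ratios

# Hypothetical parameters: three instars, daily rates.
means, mus = [2.0, 3.0, 1.0], [0.05, 0.02, 0.01]
rho = solve_rho_exponential(2.0, 0.1, means, mus)
print(rho, instars_per_adult(rho, 2.0, means, mus))
```

The search is restricted to values of $\rho$ for which every $\rho+\mu$ is non-negative, where the left-hand side is increasing; a declining population with a root outside that range is reported as an error rather than guessed at.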
EMPIRICAL SAMPLING ESTIMATES OF
GENETIC CORRELATIONS


L. D. VANVLECK AND C. R. HENDERSON
Cornell University, Ithaca, New York, U.S.A. 


Two methods of estimating the genetic correlation between two 
traits are commonly used. One procedure utilizes estimates of within 
subclass covariances as described by Hazel [1943]. An example is given 
by the analysis of observations on parent-offspring pairs. The other 
method uses estimates of the group variance and covariance components 
for two characters from an analysis within and between groups of 
relatives. The between and within analysis of sib or half-sib groups
is an example of this type of analysis. The sampling variances of the
estimates obtained in either of these ways have not been widely investigated.
Robertson [1959] has derived an estimate of the variance of
the genetic correlation estimated from the variance and covariance 
analysis of groups of relatives with observations on two variables. 
His procedure deals with the special case in which both traits have the 
same heritability and in which all subclass numbers are equal. Tallis 
[1959] has given a general solution when subclass numbers are equal 
for estimates obtained from a between and within analysis of related 
groups. A general solution is also described by Mode and Robinson
[1959] for genetic and genotypic correlations estimated from components
of variance estimated from a four-way nested classification random
model for the equal subclass numbers situation. For the parent-offspring
method of estimation Reeve [1955] has given approximate
formulae for the variance of the estimate. Apparently none of these
procedures has been tested by empirical sampling. The purpose of
this paper is to describe a procedure for obtaining empirical sampling
estimates of genetic correlations in an attempt to learn something of
the sampling variances of the estimates obtained from the parent-offspring
analysis. Sampling variances of the empirical sampling
estimates are then compared with the theoretical variances derived
by Reeve [1955].


SAMPLING PROCEDURE 


Let us first consider a sampling scheme for the one-way classification
model which may be extended easily to more complex situations.
Suppose the model we are sampling is

$$Y_{ij} = \mu + a_i + e_{ij},$$

where

$\mu$ is the underlying population mean,

$a_i$ is a random effect associated with the ith class, and

$e_{ij}$ is a random effect associated with the jth observation
in the ith class.

Further assume that the $a_i$ and $e_{ij}$ are NID $(0, \sigma_a^2)$ and NID $(0, \sigma_e^2)$,
respectively. Then the variance of $Y_{ij}$ is $\sigma_a^2 + \sigma_e^2$. To generate samples
from such a population we need to draw samples from a NID (0, 1)
population, multiply these random normal deviates by the appropriate
standard deviations, and add to these products the population mean.
Now $Y_{ij} = \mu + \sigma_a a_i + \sigma_e e_{ij}$, where the $a_i$ and $e_{ij}$ are now NID (0, 1).
If we take $Y'_{ij} = Y_{ij} - \mu$, we have $Y'_{ij} = \sigma_a a_i + \sigma_e e_{ij}$. The sums of
squares normally computed in terms of our sampling model are:


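$$\sum_{i=1}^{c}\sum_{j=1}^{n_i} Y'^2_{ij} = \sigma_a^2\sum_{i=1}^{c} n_i a_i^2 + 2\sigma_a\sigma_e\sum_{i=1}^{c} a_i e_{i\cdot} + \sigma_e^2\sum_{i=1}^{c}\sum_{j=1}^{n_i} e_{ij}^2,$$

$$\sum_{i=1}^{c}\frac{Y'^2_{i\cdot}}{n_i} = \sigma_a^2\sum_{i=1}^{c} n_i a_i^2 + 2\sigma_a\sigma_e\sum_{i=1}^{c} a_i e_{i\cdot} + \sigma_e^2\sum_{i=1}^{c}\frac{e_{i\cdot}^2}{n_i},$$

and

$$\frac{Y'^2_{\cdot\cdot}}{n_\cdot} = \sigma_a^2\,\frac{\bigl(\sum_{i=1}^{c} n_i a_i\bigr)^2}{n_\cdot} + 2\sigma_a\sigma_e\,\frac{\bigl(\sum_{i=1}^{c} n_i a_i\bigr)e_{\cdot\cdot}}{n_\cdot} + \sigma_e^2\,\frac{e_{\cdot\cdot}^2}{n_\cdot}.$$

(The usual dot notation signifies summation over that subscript.)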
It is apparent that in order to generate a sample with given subclass 
numbers it is necessary to compute only six terms 


c ni 2 c 

which are functions of the random normal deviates and subclass numbers.
In order to complete the computation of the sampling sums of
squares these functions can be multiplied by the corresponding constants
which are $\sigma_a^2$, $\sigma_e^2$, and $2\sigma_a\sigma_e$ and added accordingly. Thus functions
of deviates and subclass numbers can be generated and different sets of
parameter values used with them in order to determine the effect
of different ratios of the population variances on the sampling estimates
of the variance components. This procedure can substantially reduce
the amount of sampling time required. It should be noted that such
estimates will be correlated, but usually this will not be a disadvantage
and actually may be of interest. Extension of such a procedure to a
multiple classification is apparent. For the two-way classification with
interaction 28 terms would be generated.
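The reuse of one set of deviates with several parameter sets might be sketched as follows for the one-way classification. This is an illustration under stated assumptions, not the authors' program: the function names, the dictionary keys and the way the deviate functions are stored (one entry per distinct function appearing in the total, class and correction sums of squares) are choices made for the sketch.

```python
import random

def deviate_terms(sizes, rng):
    """Draw NID(0, 1) deviates once and return the parameter-free functions of them."""
    a = [rng.gauss(0.0, 1.0) for _ in sizes]
    e = [[rng.gauss(0.0, 1.0) for _ in range(n)] for n in sizes]
    n_total = sum(sizes)
    e_class = [sum(row) for row in e]                 # e_i.
    e_all = sum(e_class)                              # e_..
    na = sum(n * ai for n, ai in zip(sizes, a))       # sum of n_i a_i
    terms = {
        "aa": sum(n * ai * ai for n, ai in zip(sizes, a)),
        "ae": sum(ai * ei for ai, ei in zip(a, e_class)),
        "ee_total": sum(x * x for row in e for x in row),
        "ee_class": sum(ei * ei / n for ei, n in zip(e_class, sizes)),
        "aa_corr": na * na / n_total,
        "ae_corr": na * e_all / n_total,
        "ee_corr": e_all * e_all / n_total,
    }
    return terms, a, e

def sums_of_squares(terms, sigma_a, sigma_e):
    """Total, class and correction sums of squares for one parameter set."""
    va, ve, cross = sigma_a ** 2, sigma_e ** 2, 2.0 * sigma_a * sigma_e
    total = va * terms["aa"] + cross * terms["ae"] + ve * terms["ee_total"]
    classes = va * terms["aa"] + cross * terms["ae"] + ve * terms["ee_class"]
    correction = va * terms["aa_corr"] + cross * terms["ae_corr"] + ve * terms["ee_corr"]
    return total, classes, correction

rng = random.Random(1)
terms, a, e = deviate_terms([3, 5, 2], rng)            # hypothetical subclass numbers
for sigma_a, sigma_e in [(1.0, 1.0), (2.0, 3.0)]:      # hypothetical parameter sets
    print(sums_of_squares(terms, sigma_a, sigma_e))
```

Because both parameter sets are applied to the same deviates, the resulting estimates are correlated, as the text notes.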

A similar procedure may be used to generate estimates of genetic 
correlations. We are actually concerned with a four-variate model 
(2 traits on the parent and 2 on the offspring). For four-tuple samples 
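of size N let the four-variate sampling model be:

$$\begin{aligned}
X_{1i} &= \mu_1 + \lambda_1 e_{1i}, \\
X_{2i} &= \mu_2 + \lambda_2 e_{1i} + \lambda_3 e_{2i}, \\
X_{3i} &= \mu_3 + \lambda_4 e_{1i} + \lambda_5 e_{2i} + \lambda_6 e_{3i}, \\
X_{4i} &= \mu_4 + \lambda_7 e_{1i} + \lambda_8 e_{2i} + \lambda_9 e_{3i} + \lambda_{10} e_{4i},
\end{aligned}$$

where the e's are random normal deviates NID (0, 1) and the $\lambda$'s are
constants determined by the variance-covariance matrix of the X's,

$$\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{12} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\
\sigma_{13} & \sigma_{23} & \sigma_3^2 & \sigma_{34} \\
\sigma_{14} & \sigma_{24} & \sigma_{34} & \sigma_4^2
\end{pmatrix}.$$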

The X’s can be arranged in such a way that two of them refer to a 
pair of traits on a parent and two of them to the same pair of traits on 
its offspring. In this paper let us consider the situation described by 
Table 1. 


TABLE 1
ASSIGNMENT OF VARIABLES

Trait    Parent    Offspring
I        X_1       X_3
II       X_2       X_4


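According to our model:

$$\begin{aligned}
\sigma_1^2 &= \lambda_1^2, & \sigma_{12} &= \lambda_1\lambda_2, \\
\sigma_2^2 &= \lambda_2^2 + \lambda_3^2, & \sigma_{13} &= \lambda_1\lambda_4, \\
\sigma_3^2 &= \lambda_4^2 + \lambda_5^2 + \lambda_6^2, & \sigma_{23} &= \lambda_2\lambda_4 + \lambda_3\lambda_5, \\
\sigma_4^2 &= \lambda_7^2 + \lambda_8^2 + \lambda_9^2 + \lambda_{10}^2, & \sigma_{14} &= \lambda_1\lambda_7, \\
\sigma_{24} &= \lambda_2\lambda_7 + \lambda_3\lambda_8, & \sigma_{34} &= \lambda_4\lambda_7 + \lambda_5\lambda_8 + \lambda_6\lambda_9.
\end{aligned}$$

Solving for the $\lambda$'s in terms of the variances and covariances we obtain,
each in terms of those already found:

$$\begin{aligned}
\lambda_1 &= \sigma_1, \\
\lambda_2 &= \sigma_{12}/\sigma_1, \\
\lambda_3 &= (\sigma_2^2 - \sigma_{12}^2/\sigma_1^2)^{1/2}, \\
\lambda_4 &= \sigma_{13}/\sigma_1, \\
\lambda_5 &= (\sigma_{23} - \sigma_{12}\sigma_{13}/\sigma_1^2)/\lambda_3, \\
\lambda_6 &= (\sigma_3^2 - \lambda_4^2 - \lambda_5^2)^{1/2}, \\
\lambda_7 &= \sigma_{14}/\sigma_1, \\
\lambda_8 &= (\sigma_{24} - \sigma_{12}\sigma_{14}/\sigma_1^2)/\lambda_3, \\
\lambda_9 &= (\sigma_{34} - \lambda_4\lambda_7 - \lambda_5\lambda_8)/\lambda_6,
\end{aligned}$$

and

$$\lambda_{10} = (\sigma_4^2 - \lambda_7^2 - \lambda_8^2 - \lambda_9^2)^{1/2}.$$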
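The $\lambda$'s are the entries of the lower-triangular (Cholesky) factor of the variance-covariance matrix, so they can be computed row by row. The sketch below assumes that reading of the model; the function name and the covariance matrix are hypothetical and chosen only to be positive definite.

```python
import math

def cholesky_lower(cov):
    """Lower-triangular factor `low` with low * low' = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            partial = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(cov[r][r] - partial)
            else:
                low[r][c] = (cov[r][c] - partial) / low[c][c]
    return low

# Hypothetical variance-covariance matrix for (X1, X2, X3, X4).
cov = [
    [1000.0, 100.0, 100.0, 20.0],
    [100.0, 1000.0, 20.0, 100.0],
    [100.0, 20.0, 1000.0, 100.0],
    [20.0, 100.0, 100.0, 1000.0],
]
low = cholesky_lower(cov)
# lambda_1 .. lambda_10, reading the factor row by row.
lambdas = [low[r][c] for r in range(4) for c in range(r + 1)]
print(lambdas)
```

Reading the factor row by row gives $\lambda_1$ (row 1), $\lambda_2, \lambda_3$ (row 2), $\lambda_4$ to $\lambda_6$ (row 3) and $\lambda_7$ to $\lambda_{10}$ (row 4), matching the order in which the $\lambda$'s enter the sampling model.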


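Next let us define the following quadratics of the normal deviates:

$$L_p = \sum_{i=1}^{N} e_{pi}^2 - \frac{\bigl(\sum_{i=1}^{N} e_{pi}\bigr)^2}{N} \quad (p = 1, \ldots, 4),$$

$$L_{pq} = \sum_{i=1}^{N} e_{pi}e_{qi} - \frac{\bigl(\sum_{i=1}^{N} e_{pi}\bigr)\bigl(\sum_{i=1}^{N} e_{qi}\bigr)}{N} \quad (p < q).$$

Using these definitions and the procedure described previously, the
empirical sampling variance and covariance estimates with (N - 1)
degrees of freedom are the following functions of $\lambda$'s, normal deviates,
and N:

$$\begin{aligned}
\hat\sigma_1^2 &= (\lambda_1^2 L_1)/(N-1), \\
\hat\sigma_2^2 &= (\lambda_2^2 L_1 + \lambda_3^2 L_2 + 2\lambda_2\lambda_3 L_{12})/(N-1), \\
\hat\sigma_{12} &= (\lambda_1\lambda_2 L_1 + \lambda_1\lambda_3 L_{12})/(N-1), \\
\hat\sigma_3^2 &= [\lambda_4^2 L_1 + \lambda_5^2 L_2 + \lambda_6^2 L_3 + 2\lambda_4\lambda_5 L_{12} + 2\lambda_4\lambda_6 L_{13} + 2\lambda_5\lambda_6 L_{23}]/(N-1), \\
\hat\sigma_{13} &= (\lambda_1\lambda_4 L_1 + \lambda_1\lambda_5 L_{12} + \lambda_1\lambda_6 L_{13})/(N-1), \\
\hat\sigma_{23} &= [\lambda_2\lambda_4 L_1 + \lambda_3\lambda_5 L_2 + (\lambda_2\lambda_5 + \lambda_3\lambda_4)L_{12} + \lambda_2\lambda_6 L_{13} + \lambda_3\lambda_6 L_{23}]/(N-1), \\
\hat\sigma_4^2 &= [\lambda_7^2 L_1 + \lambda_8^2 L_2 + \lambda_9^2 L_3 + \lambda_{10}^2 L_4 + 2\lambda_7\lambda_8 L_{12} + 2\lambda_7\lambda_9 L_{13} \\
&\qquad + 2\lambda_7\lambda_{10} L_{14} + 2\lambda_8\lambda_9 L_{23} + 2\lambda_8\lambda_{10} L_{24} + 2\lambda_9\lambda_{10} L_{34}]/(N-1), \\
\hat\sigma_{14} &= (\lambda_1\lambda_7 L_1 + \lambda_1\lambda_8 L_{12} + \lambda_1\lambda_9 L_{13} + \lambda_1\lambda_{10} L_{14})/(N-1), \\
\hat\sigma_{24} &= [\lambda_2\lambda_7 L_1 + \lambda_3\lambda_8 L_2 + (\lambda_2\lambda_8 + \lambda_3\lambda_7)L_{12} + \lambda_2\lambda_9 L_{13} \\
&\qquad + \lambda_2\lambda_{10} L_{14} + \lambda_3\lambda_9 L_{23} + \lambda_3\lambda_{10} L_{24}]/(N-1),
\end{aligned}$$

and

$$\begin{aligned}
\hat\sigma_{34} &= [\lambda_4\lambda_7 L_1 + \lambda_5\lambda_8 L_2 + \lambda_6\lambda_9 L_3 + (\lambda_4\lambda_8 + \lambda_5\lambda_7)L_{12} + (\lambda_4\lambda_9 + \lambda_6\lambda_7)L_{13} \\
&\qquad + (\lambda_5\lambda_9 + \lambda_6\lambda_8)L_{23} + \lambda_4\lambda_{10} L_{14} + \lambda_5\lambda_{10} L_{24} + \lambda_6\lambda_{10} L_{34}]/(N-1).
\end{aligned}$$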


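These sample estimates may be used to construct estimates of
genetic parameters which have known population values. The heritability
of trait I may be estimated by

$$\hat h_{I}^2 = 2\hat\sigma_{13}/\hat\sigma_1\hat\sigma_3$$

and of trait II by

$$\hat h_{II}^2 = 2\hat\sigma_{24}/\hat\sigma_2\hat\sigma_4.$$

The genetic correlation is estimated from residual covariance components
according to Hazel [1943] by four methods:

$$g_1 = \frac{\hat\sigma_{23}}{\sqrt{\hat\sigma_{13}\hat\sigma_{24}}}, \quad
g_2 = \frac{\hat\sigma_{14}}{\sqrt{\hat\sigma_{13}\hat\sigma_{24}}}, \quad
g_3 = \frac{\hat\sigma_{14} + \hat\sigma_{23}}{2\sqrt{\hat\sigma_{13}\hat\sigma_{24}}}, \quad
g_4 = \sqrt{\frac{\hat\sigma_{14}\hat\sigma_{23}}{\hat\sigma_{13}\hat\sigma_{24}}}.$$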


It should be noted that the genetic model described here assumes a
random mating population where gene effects approximate to the four-variate
normal distribution. If selection is applied to the population,
the parameters will probably change. This model also does not consider
effects due to linkage or to sexual or cytoplasmic differences.


SAMPLING RESULTS 


Samples were generated with N = 100 for the four-variate sampling 
model which has been described. Twelve hundred sets of the L’s were 
obtained with 99 degrees of freedom for each set. These 1200 sampling 
coefficients were then combined to form 120 sets of L’s with 990 degrees 
of freedom and 240 sets with 495 degrees of freedom. Then 24 sets of parameter
values (see Table 2) were used with the sampling coefficients to
construct sample estimates of the genetic correlations ($g_1$, $g_2$, $g_3$, and $g_4$).
The true genetic correlation, $g$, and the heritabilities of the two traits,
$h_I^2$ and $h_{II}^2$, for each parameter set are shown in Table 2.

The parameter values were assigned as follows: nine parameter
sets (1-3, 7-9, and 13-15) are constructed so that the two traits have
equal variances and heritabilities; nine other parameter sets (16-24)
are constructed so that the variances of trait II are half the variances
of trait I and the heritabilities of the two traits are equal; and the
remaining six parameter sets (4-6 and 10-12) are constructed so that
the two traits have equal variances but unequal heritabilities. Three
values were chosen for heritabilities and genetic correlations, 0.2, 0.5,
and 0.8 with the exception of sets 19 and 20 where the genetic correlations
were .14 and .35.

The genetic interpretation of parameter sets 13, 15, and 22 is not
possible. Inadvertently, covariances used in these sets were assigned
which do not admit explanation by the usual genetic theory in that
the environmental covariances between traits I and II exceed the
environmental variances of traits I and II. Another inconsistency
occurs in the parameter sets 4-6 and 10-12 where the phenotypic
correlations between traits I and II are different for parent and offspring,
thus implying different environmental correlations. These
differences were not intended but they do not invalidate comparisons
with the theoretical variances since the derived estimates of the sampling
variances depend only on a four-variate distribution not necessarily
subject to genetic interpretation. These inconsistencies should, however,
be noted.

The means of the sample estimates are presented in Table 3. Estimates
were discarded if the denominator of the estimate had a negative
component or if the signs of the numerator components of $g_4$ were
different. If the signs of the numerator components of $g_4$ were both
negative, the sign of $g_4$ was considered to be negative.
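The discarding rule can be made concrete with a short sketch. It assumes the four estimates are the two single-covariance ratios, their arithmetic mean and their geometric mean, each over $\sqrt{\hat\sigma_{13}\hat\sigma_{24}}$; which ratio is numbered $g_1$ and which $g_2$ is not relied on here, so the results are labelled by construction. Treating a zero denominator component as a discard is an added assumption, and the function name is hypothetical.

```python
import math

def genetic_correlation_estimates(s13, s24, s14, s23):
    """Four estimates from sample covariances; None marks a discarded estimate."""
    names = ("from_s23", "from_s14", "arithmetic", "geometric")
    # Discard everything when a denominator component is negative
    # (zero is also rejected here to avoid division by zero).
    if s13 <= 0.0 or s24 <= 0.0:
        return dict.fromkeys(names)
    denom = math.sqrt(s13 * s24)
    result = {
        "from_s23": s23 / denom,
        "from_s14": s14 / denom,
        "arithmetic": (s14 + s23) / (2.0 * denom),
    }
    if s14 * s23 < 0.0:
        result["geometric"] = None          # numerator components differ in sign
    else:
        sign = -1.0 if (s14 < 0.0 and s23 < 0.0) else 1.0
        result["geometric"] = sign * math.sqrt(s14 * s23) / denom
    return result

print(genetic_correlation_estimates(100.0, 100.0, 20.0, 20.0))
print(genetic_correlation_estimates(100.0, 100.0, 20.0, -20.0))
```

Only the geometric-mean estimate is lost when the two numerator covariances differ in sign; the other three remain defined in that case.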
TABLE 2
PARAMETER VALUES USED TO CONSTRUCT SAMPLE ESTIMATES
OF GENETIC CORRELATIONS

Parameter
set    h_I²(a) h_II²(b) g(c) |  σ1²   σ2²   σ3²   σ4² |  σ12  σ13  σ14  σ23  σ24  σ34

 1      .20     .20    .20  | 1000  1000  1000  1000 |  100  100   20   20  100  100
 2      .20     .20    .50  | 1000  1000  1000  1000 |  100  100   50   50  100  100
 3      .20     .20    .80  | 1000  1000  1000  1000 |  100  100   80   80  100  100
 4      .50     .20    .20  | 1000  1000  1000  1000 |  250  250   32   32  100  100
 5      .50     .20    .50  | 1000  1000  1000  1000 |  250  250   79   79  100  100
 6      .50     .20    .80  | 1000  1000  1000  1000 |  250  250  126  126  100  100
 7      .50     .50    .20  | 1000  1000  1000  1000 |  250  250   50   50  250  250
 8      .50     .50    .50  | 1000  1000  1000  1000 |  250  250  125  125  250  250
 9      .50     .50    .80  | 1000  1000  1000  1000 |  250  250  200  200  250  250
10      .50     .80    .20  | 1000  1000  1000  1000 |  250  250   63   63  400  400
11      .50     .80    .50  | 1000  1000  1000  1000 |  250  250  158  158  400  400
12      .50     .80    .80  | 1000  1000  1000  1000 |  250  250  253  253  400  400
13      .80     .80    .20  | 1000  1000  1000  1000 |  400  400   80   80  400  400
14      .80     .80    .50  | 1000  1000  1000  1000 |  400  400  200  200  400  400
15      .80     .80    .80  | 1000  1000  1000  1000 |  400  400  320  320  400  400
16      .20     .20    .20  | 1000   500  1000   500 |  100  100   14   14   50  100
17      .20     .20    .50  | 1000   500  1000   500 |  100  100   35   35   50  100
18      .20     .20    .80  | 1000   500  1000   500 |  100  100   57   57   50  100
19      .50     .50    .14  | 1000   500  1000   500 |  250  250   25   25  125  250
20      .50     .50    .35  | 1000   500  1000   500 |  250  250   62   62  125  250
21      .50     .50    .50  | 1000   500  1000   500 |  250  250   88   88  125  250
22      .80     .80    .20  | 1000   500  1000   500 |  400  400   57   57  200  400
23      .80     .80    .50  | 1000   500  1000   500 |  400  400  141  141  200  400
24      .80     .80    .80  | 1000   500  1000   500 |  400  400  226  226  200  400

(a) Heritability of trait I. (b) Heritability of trait II. (c) Genetic correlation between traits I and II.


All four methods of estimation appear to provide unbiased estimates
of the genetic correlation when the sample size is large. For small
sample size (N = 100) and low heritability (.20) of at least one trait,
however, the means presented in Table 3 indicate the estimates may be
biased upwards. The bias apparently increases with an increase in the
genetic correlation when heritability is fixed. It is also worth noting
that the bias is larger in nearly every case for $g_4$ than for $g_1$, $g_2$ or $g_3$
when there is evidence of bias. This increased bias is probably caused
by discarding estimates of $g_4$ when the numerator covariances differ
in sign since in these cases $g_1$, $g_2$ or $g_3$ would usually be small or negative.

TABLE 3
MEANS OF SAMPLE ESTIMATES OF GENETIC CORRELATIONS ASSOCIATED
WITH PARAMETERS IN TABLE 2.*

Columns, in order: g1 g2 g3 g4 for 1200 samples (f = 99); g1 g2 g3 g4 for
240 samples (f = 495); g1 g2 g3 g4 for 120 samples (f = 990).

Parameter
set
 1   .22 .25 .23 .17 .19 .25 .20 .16 .18 .20
 2   .65 .61 .78 .56 .51 .54 .57 .50 .46 .48 .50
 3   .78
 4   .22 .18 .20 .23 .22 .17 .20 .23
 5   .56 .51 .58 .52 .50 .51 .49 .51 .48 .49 .48
 6   .95 .89 .92 .95 .84 .81 .83 .80 .82 .78 .80 .78
 7   .18 .21 .20 .18 .19 .20 .20 .18 .19 .19
 8   .63 .50 .52 .53 .50 .48 .49 .48 .50 .48 .49 .48
 9   .88 .86 .87 .8 .80 .79 .79 .78 .79 .78 .79 .78
10   .18 .15 .21 .19 .18 .19 .18
11   .52 .50 .51 .50 .50 .49 .50 .49 .50 .49 .49 .49
12   .86 .83 .85 .81 .80 .79 .80 .79 .80 .79 .79 .79
13   .18 .18 .18 .21 .20 .19 .19 .19 .20 .19 .19 .19
14   .50 .49 .50 .49 .49 .49 .50 .49 .49 .49
15   .82 .80 .81 .79 .80 .79 .80 .79 .80 .79 .79 .79
16   .22 .15 .18 .12 .18 .14 .16 .21 [?] .14 .16 .17
17   .59 .61 .60 .70 .56 .52 .54 .56 .50 .47 .49
18   1.09 .97 1.03 1.15 .94 .90 .92 .89 .84 .80 .82 .79
19   .10 .06 .08 .10 .11 .12 .15 236
21   .47 .47 .48 .47 .50 .47 .48 .48
22   .19 .49 .20 .20 .20 .19
23   .49 .47 .49 .49 .49 .50 .49 .49 .49
24   .80 .79 .80 .78 .80 .79 .79 .79 .80 .79 .79 .79

*f is the number of degrees of freedom associated with each sample.

The sampling variances of the estimates of genetic correlations were
computed as $\sum_{i=1}^{m_j}(g_{ij} - \bar g_{\cdot j})^2/(m_j - 1)$ where $m_j$ is the number of
estimates for the jth estimation method. These sampling variances
are shown in Table 4.

Reeve [1955] gives approximate formulae for the large sample
variances of the estimates $g_3$ and $g_4$. The variances are equal for the
two estimates. This variance is (in our notation):

2 22 2 2 2.2 2.23 2 
14 


G23 013 O24 714923 714923 F14913 


2 2 
201204 ie 201203 202034 2012034 2014023 = 4); 

014024 923913 923024 913024 913024 
where $f$ is the degrees of freedom associated with each component
estimated. Using the same method as Reeve the variances of $g_1$ and $g_2$
can be derived as


2.2 2 2 
V _ gi {F293 103 F204 71203 52034 12034 + 1 


2 2 : 
923 4oi3 923913 023924 2013024 
and 
2 22 22 23 2 2 
V(gz) + 0103 7204 91034 1204 712034 1) 
f 4024 714024 2013024 


The large sample variances of the genetic correlations corresponding
to the 24 parameter sets are listed in Table 5. The expected variances
of $g_1$ and $g_2$ are about double those of $g_3$ and $g_4$. The expected variances
of $g_1$ and $g_2$ are slightly different for parameter sets 4-6 and 10-12 for
which the heritabilities of the two traits are different.

The computed sampling variances of the estimates of $g_1$ and $g_2$ are
approximately twice those of $g_3$ and $g_4$ as expected. The agreement
between the expected and computed sampling variances of the estimates
is poor for the small sample size (f = 99) except for the cases in which
the heritability of both traits is high (0.80). The differences between
computed and expected variances are much less for f = 495. The
variances of estimates which still deviate most from the expected are
those associated with low heritability (0.20) of both traits. For a
relatively large sample size (f = 990) the agreement between expected
and computed values is surprisingly close for all combinations of parameters
used in this study.

The extremely large sampling variances which occurred with some
parameter sets are probably due to large estimates in turn due to small
values of one or both of the denominator covariances. This suggests
that heritability plays a dominant role in determining the sampling
variances of genetic correlation estimates. The coefficient of variation
for one of the denominator covariances under the normal distribution

TABLE 5
EXPECTED VARIANCES OF GENETIC CORRELATIONS
FOR THE PARAMETERS OF TABLE 2.(a)

Columns, in order: V(g1) V(g2) V(g3) for f = 99; V(g1) V(g2) V(g3) for
f = 495; V(g1) V(g2) V(g3) for f = 990.

Parameter
set
1 990 990 .495 | .198 .198 .099 | .099 .099 .050 
2 1.086 1.036 .540 207 «(108 .104 .104 .054 
3 1.174 1.174 .676 .200 .200 .185 .068 
390 .192 .078 .076 .038 .0389 .038 .019 
5 414 .385 6.206 .083 .041 .041 .038  .021 
6 490 .444 .272 .098 .089 .054 .049 .044 (27 
7 .148 .078 .030 .030 .016 .015 .015 .008 
.142 .142 .070 .028 .028 .014 
9 151 161 O77 .0380 .030 .015 015 .015 .008 
10 090 .089 .049 .018 .018 .010 .009 .009 .005 
11 083 080.040 .016 .008 .008 .008 .004 
12 OSS .041 018 .016 .008 .009 .008 .004 
13 054 .054 .033 .011 .007 .005 .003 
14 046 046.023 .009 .009 .005 .005 .005 .002 
15 045 .020 .009 .009 .004 .004 .004 .002 
16 974 .483 .195 .097 .097 .097 .048 
17 994 994 .199 .199 .101 099 .050 
18 1.3318 1.113 223 .223 .124 
19 147 081 .029 .029 .016 .015 .015 .008 
20 .132 132.066 .026 .026 .013 .013 
21 .059 .025 .025 .012 .013 .006 
22 .050 .050 .034 .010 .010 .007 .005 .005 .003 
23 .037 .019 .007 .004 .004 .004 .002 
24 .032 .032 .012 .006 .006 .002 .003 .003  .001 


(a) The expected variance of $g_4$ is the same as that of $g_3$.

This average coefficient of variation turns out to be useful as a "rule of
thumb" for determining when the theoretical approximation of the variance
of the genetic correlation is likely to be good. We can note that when
the average coefficient of variation, A, is 20 percent or less the theoretical
approximations in Table 5 are in close agreement with the
empirical sampling variances summarized in Table 4.

This guideline would apply equally well to estimates of genetic
correlations as to their sampling variances. When the average coefficient
of variation is larger than 20 percent the estimates are likely to be
biased upward. In fact, if prior knowledge indicates the average
coefficient to be greater than 20 percent, some thought should be
given to whether or not the genetic correlation should be estimated.


CONCLUSIONS 


The results given in Table 4 indicate for sample sizes of 1,000 or 
more that the approximate formulae of Reeve [1955] for the large sample 
variance of the genetic correlation are accurate. For smaller sample 
sizes (500 or less) the approximations are not accurate unless the 
heritabilities of the examined traits are high. When the sample size 
is 100 or less the approximations may tend to be very misleading. 

It seems safe to say that investigators who are estimating genetic 
correlations need at least 1,000 sets of observations in order to obtain 
reasonable estimates of the sampling variances of the estimates. Even 
then the sampling variances of the genetic correlation estimates may 
still be too large for the estimates to be of use, especially if heritabilities
of the traits are low, for example $h^2 < 0.20$. A useful guide may be
the average coefficient of variation for the denominator covariances 
associated with the genetic correlation which depends on heritability. 
If this coefficient is 20 percent or less then the theory approximation 
may be quite good. 

Estimation by either of procedures g; or g, seems preferable since
the sampling variances are only about half as great as for methods 
g, or go. Procedure g; appears to be better than g, especially for small 
sample sizes since the results have indicated g, is less likely to be biased 
than g, probably because more estimates of g, will usually be discarded 
as imaginary or uninterpretable. 


SUMMARY 


A method of generating samples from a normally distributed population
is described which has the advantage of being parameter free.
Sample coefficients can be developed with zero means and unit variances
and then different sets of parameter values used with the same
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coefficients to develop sample sums of squares and crossproducts from 
populations with different parameters. This method is used to con- 
struct sample estimates of genetic correlations for 3 sample sizes and 
24 sets of parameter values. The sampling variances of these estimates 
are compared with those expected by application of Reeve’s [1955] 
formulae. 
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METHOD OF TESTING HYPOTHESES 
AND ESTIMATING PARAMETERS 
FOR THE LOGISTIC MODEL¹


JAMES E. GRIZZLE


Department of Biostatistics, School of Public Health 
University of North Carolina, Chapel Hill, North Carolina, U.S.A. 


1. INTRODUCTION 


Data collected in many types of research often consist of the 
proportion of experimental units having a specified attribute. In 
this paper we shall be concerned with the analysis of this type of data 
when it can be arranged in a multiway classification as for example, 
a factorial arrangement. An example of this type of data is given 
in Table 1. 

Several methods may be used in analyzing these data. Regardless 
of the method, a model relating the proportion having the attribute 
to the treatments must be assumed for a meaningful interpretation of 
the experimental results. The problem of the most appropriate model 
becomes particularly acute when some of the treatments are applied 
at several levels and it is desired to investigate the nature of treat- 
ment effects with regard to both main effects and interactions. If 
the sample sizes in the cells are equal, the observed proportions are 
often analyzed by the analysis of variance. A more desirable procedure 
is to analyze the arc sine transformation of the proportion. It is well 
known that this transformation stabilizes the variance if the sample 
sizes are equal and not too small, and, for a large class of data, it pro- 
vides a unit of measurement on which treatment effects are approxi- 
mately linear except at values of the proportion near zero or one. 

In bioassay both the logit and the probit transformations have a 
long history of use. With appropriate extensions these models can be 
used in analyzing data of the type being discussed. Since the propor- 
tion responding has been found to increase sigmoidally with increasing 
stimulus for many phenomena, these transformations are particularly 
effective in providing a scale on which treatment effects are linear.
Dyke and Patterson [1952] gave an example of how the logit transformation
can be used in analyzing a $2^n$ factorial arrangement. Their
method can be adapted to other arrangements by an obvious extension.

¹This research was sponsored by the Cancer Chemotherapy National Service Center of the National
Institutes of Health as part of a research contract.

The purpose of this paper is to present a new method of testing 
hypotheses when the logistic model is used. It is shown that estimates 
of the cell probabilities required to make certain tests can be used to 
obtain maximum likelihood (ML) estimates of the parameters in the 
assumed model with less computation than is commonly associated 
with ML estimation with models of this type. The approach used 
here was suggested by the work of Reiersøl [1954] and Mitra [1955].


2. NOTATION 


To avoid multiple subscripts we will number the cells $i = 1, \ldots, r$.
It will be assumed that the relationship between the treatments and
the probability $P_i$ of a response is described by

$$l = [\log_e (P_i/Q_i)] = A\theta, \qquad k < r,$$

where $A$ is an $r$ by $k$ matrix of full rank of known constants determined
by the model and the design of the experiment, $\theta$ is a $k$-element column
vector of unknown parameters, $l$ is an $r$-element column vector, and
$Q_i = 1 - P_i$.

The following additional notation will be used:

$n_i$ = sample size in the $i$-th cell,

$u_i$ = number of responses observed,

$v_i = n_i - u_i$,

$C_{i\cdot}$ is the $i$-th row vector and

$C_{\cdot i}$ is the $i$-th column vector of a matrix $C$.


3. THEORY 


Mitra [1955] and Diamond [1958], proceeding along the same lines 
of proof as Cramér [1945], have given the mathematical properties 
that the model and the hypotheses must have for the test statistics 
to be asymptotically distributed as $\chi^2$ when the null hypothesis is true,
and for the existence of unique consistent estimates of the parameters 
in the model for samples from multinomial distributions. Their results 
are given as general forms with the model and the hypotheses being 
unspecified except for certain analytic conditions. It is easily demonstrated
that the logistic model and tests of linear hypotheses have
the required properties (Grizzle [1960]).


3.1 Tests of Hypotheses. 


Let it be given that the logistic model fits the data except for chance
deviation. If we wish to test the hypothesis

$$H_0: C^*\theta = \epsilon, \tag{3.1.1}$$

where rank $C^* = t$, $t < k < r$, and $\epsilon$ is a vector of preassigned constants,
we construct a matrix $C$ such that

$$Cl = C^*\theta = \epsilon, \tag{3.1.2}$$

where $C^* = CA$, if $H_0$ is true. Then using (3.1.2) as restraints on the
likelihood, estimates of $P_i$ are obtained.
The logarithm of the likelihood subject to the restraints is

$$L = \text{constant} + \sum_i (u_i \log P_i + v_i \log Q_i) + \lambda'(\epsilon - Cl), \tag{3.1.3}$$

where $\lambda$ is a $t \times 1$ vector of Lagrangian multipliers.
Taking partial derivatives, equating them to zero and solving for
$P_i$, we find

$$\hat P_i = (u_i - \lambda' C_{\cdot i})/n_i. \tag{3.1.4}$$

To complete the solution, $\lambda$ must be eliminated from (3.1.4).
Let $\hat l$ be the vector of elements $\hat l_i = \log_e (\hat P_i/\hat Q_i)$, where $\hat P_i$ is defined
by (3.1.4). Then (3.1.2) implies that

$$C\hat l = \epsilon, \tag{3.1.5}$$

or its antilog

$$\prod_i (\hat P_i/\hat Q_i)^{c_{ki}} = e^{\epsilon_k}, \qquad k = 1, \ldots, t, \tag{3.1.6}$$

where $c_{ki}$ is the element in the $k$-th row and $i$-th column of $C$,
can be solved for $\lambda$. Either of these sets of equations can be solved by
the familiar Newton-Raphson method which will be given in Section
4.1. In general, the form to use for ease of solution depends on the $c_{ki}$.


4. COMPUTATION 
4.1 General Solution. 


Since it is not possible to derive an explicit solution for $\lambda$ from
(3.1.5) or (3.1.6), we resort to the Newton-Raphson method of solution.
Thus we expand (3.1.5) about some guessed value of $\lambda$, $\lambda_0$ say, neglect
all derivatives higher than the first order and solve the resulting equation
for $\Delta\lambda = \lambda - \lambda_0$. Expanding $C\hat l$ about the point $\lambda_0$, the first-order
approximation to (3.1.5) is given by

$$C\hat l_0 - C D_0 C' \,\Delta\lambda = \epsilon, \tag{4.1.1}$$

where $D$ is a diagonal matrix of elements $1/(n_i \hat P_i \hat Q_i)$, and $\hat l_0$ and $D_0$
are $\hat l$ and $D$ evaluated at $\lambda_0$. Therefore,

$$\Delta\lambda = (C D_0 C')^{-1} (C\hat l_0 - \epsilon). \tag{4.1.2}$$

If, as in many problems encountered in practice, $\epsilon = 0$, then

$$\Delta\lambda = (C D_0 C')^{-1} C\hat l_0. \tag{4.1.3}$$

A more accurate solution can be computed by letting $\lambda = \lambda_0 + \Delta\lambda$
become the $\lambda_0$ for a second iteration. The process is repeated until
$\Delta\lambda$ is as small as desired. Then, if $H_0$ is true,

$$X^2 = \lambda' C D_0 C' \lambda, \tag{4.1.4}$$

where $D_0$ is taken from the last iteration, approaches the $\chi^2$-distribution
with $t$ degrees of freedom as the $n_i$ get large.
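The iteration (4.1.2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's program: the function names `solve_linear`, `cell_quantities` and `restrained_fit` are hypothetical, the linear system is solved by plain Gaussian elimination, and the data are the infant-loss counts of Table 1 with the two no-interaction restraints used later in Section 5.3 (cells ordered birth order 2 problems, 2 controls, 3-4 problems, 3-4 controls, 5+ problems, 5+ controls).

```python
import math

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def cell_quantities(u, n, C, lam):
    """Return P_i from (3.1.4) and D_i = 1/(n_i P_i Q_i)."""
    t = len(C)
    P = [(u[i] - sum(lam[k] * C[k][i] for k in range(t))) / n[i]
         for i in range(len(u))]
    D = [1.0 / (n[i] * P[i] * (1.0 - P[i])) for i in range(len(u))]
    return P, D

def restrained_fit(u, n, C, eps, lam0, max_iter=50, tol=1e-10):
    """Newton-Raphson on C l = eps; returns (lambda, P, X^2)."""
    t, r = len(C), len(u)
    lam = list(lam0)
    for _ in range(max_iter):
        P, D = cell_quantities(u, n, C, lam)
        logit = [math.log(p / (1.0 - p)) for p in P]
        CDC = [[sum(C[a][i] * D[i] * C[b][i] for i in range(r))
                for b in range(t)] for a in range(t)]
        rhs = [sum(C[a][i] * logit[i] for i in range(r)) - eps[a]
               for a in range(t)]
        step = solve_linear(CDC, rhs)            # (4.1.2)
        lam = [lam[k] + step[k] for k in range(t)]
        if max(abs(s) for s in step) < tol:
            break
    P, D = cell_quantities(u, n, C, lam)
    # (4.1.4): lambda' C D C' lambda = sum_i D_i (lambda' C_.i)^2
    x2 = sum(D[i] * sum(lam[k] * C[k][i] for k in range(t)) ** 2
             for i in range(r))
    return lam, P, x2

u = [20, 10, 26, 16, 27, 14]       # mothers with losses
n = [102, 64, 67, 46, 49, 37]      # cell totals
C = [[1, -1, 0, 0, -1, 1],         # l11 - l12 - l31 + l32 = 0
     [0, 0, 1, -1, -1, 1]]         # l21 - l22 - l31 + l32 = 0
lam, P, x2 = restrained_fit(u, n, C, eps=[0.0, 0.0], lam0=[0.0, 0.0])
print([round(v, 3) for v in lam], round(x2, 3))
```

Starting from $\lambda_0 = 0$ is an assumption that works here because no cell count is zero; the multipliers it reaches can be compared with the values reported in Section 5.3.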

There are two special cases for which computational formulas can 
be derived which do not involve matrix inversion. The first, the test 
with one degree of freedom, can be obtained in an obvious way from 
(4.1.2). The second is the test for homogeneity of a group of treatment 
effects. This is particularly important because many tests of inter- 
action can be put into this form. 


4.2 Computation for the Test of Homogeneity.


For the method to be useful we must be able to write $H_0$ as
$\theta_1 = \theta_2 = \cdots = \theta_t$. In some problems this is easily done; for others
the model may have to be reparameterized, and for some it is not
possible. This hypothesis can also be written in the form
$\theta_1 - \theta_t = \theta_2 - \theta_t = \cdots = \theta_{t-1} - \theta_t = 0$ so that $C$ implied by the test is
non-singular. The notation will be more clear if we renumber the $l_i$ to
become $l_{ij}$, $i = 1, \ldots, t$; $j = 1, \ldots, s$.

Given that the model fits the data, to test the hypothesis

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_t$$

there must exist $c_{ij}$ such that

$$\sum_j c_{1j} l_{1j} = \theta_1, \quad \ldots, \quad \sum_j c_{tj} l_{tj} = \theta_t. \tag{4.2.1}$$

Hence,

$$\sum_j c_{ij} l_{ij} - \sum_j c_{tj} l_{tj} = 0, \qquad i = 1, \ldots, t - 1, \tag{4.2.2}$$


376 BIOMETRICS, SEPTEMBER 1961 


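if $H_0$ is true. Using (4.2.2) as restraints on the likelihood in estimating
the $P_{ij}$, we find

$$\hat P_{ij} = (u_{ij} - c_{ij}\lambda_i)/n_{ij}, \qquad i = 1, \ldots, t - 1, \; j = 1, \ldots, s, \tag{4.2.3}$$

and

$$\hat P_{tj} = \Bigl(u_{tj} + c_{tj}\sum_{i=1}^{t-1}\lambda_i\Bigr)\Big/ n_{tj}, \qquad j = 1, \ldots, s.$$

If we let $\lambda_t = -\sum_{i=1}^{t-1}\lambda_i$, so that $\sum_{i=1}^{t}\lambda_i = 0$, and substitute $\hat P_{ij}$
into (4.2.1) we have

$$\sum_j c_{1j}\log_e\frac{u_{1j} - c_{1j}\lambda_1}{v_{1j} + c_{1j}\lambda_1} = \cdots = \sum_j c_{tj}\log_e\frac{u_{tj} - c_{tj}\lambda_t}{v_{tj} + c_{tj}\lambda_t}. \tag{4.2.4}$$

Expanding (4.2.4) about trial values $\lambda_{i0}$ of the $\lambda_i$, where $\sum_i \lambda_{i0} = 0$,
we have

$$T_1 - \delta\lambda_1 S_1 = T_2 - \delta\lambda_2 S_2 = \cdots = T_t - \delta\lambda_t S_t, \tag{4.2.5}$$

where

$$T_i = \sum_j c_{ij}\log_e\frac{u_{ij} - c_{ij}\lambda_{i0}}{v_{ij} + c_{ij}\lambda_{i0}}, \qquad
S_i = \sum_j c_{ij}^2\left(\frac{1}{u_{ij} - c_{ij}\lambda_{i0}} + \frac{1}{v_{ij} + c_{ij}\lambda_{i0}}\right),$$

and

$$\delta\lambda_i = \lambda_i - \lambda_{i0}.$$

Choose the $i$-th and $k$-th members of (4.2.5) and solve for $\delta\lambda_k$. Then

$$\delta\lambda_k = (T_k - T_i + \delta\lambda_i S_i)/S_k.$$

Now $\sum_k \delta\lambda_k = 0$, and thus

$$H - T_i/S + \delta\lambda_i S_i/S = 0,$$

where

$$H = \sum_k T_k/S_k \quad\text{and}\quad 1/S = \sum_k 1/S_k.$$

Therefore

$$\delta\lambda_i = (T_i - HS)/S_i. \tag{4.2.6}$$

The process is continued as in (4.1.3) until $\delta\lambda_i$ is sufficiently small.
Then

$$X^2 = \sum_i \lambda_i^2 S_i,$$

with $S_i$ taken from the last iteration, is asymptotically distributed as
$\chi^2$ with $t - 1$ degrees of freedom if $H_0$ is true.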
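As a rough illustration of (4.2.6), the sketch below iterates the logarithmic form for $t$ groups. The function name `homogeneity_test` and its argument layout are assumptions made for this example only; the data are the Table 1 counts grouped by birth order with $c_{ij} = 1$, which is the hypothesis of equal birth-order effects tested in Section 5.3.

```python
import math

def group_sums(u, v, c, lam):
    """Return the lists T_i and S_i of (4.2.5) at the trial values lam."""
    T, S = [], []
    for i in range(len(u)):
        Ti = Si = 0.0
        for uij, vij, cij in zip(u[i], v[i], c[i]):
            a = uij - cij * lam[i]        # n_ij * P_ij
            b = vij + cij * lam[i]        # n_ij * Q_ij
            Ti += cij * math.log(a / b)
            Si += cij * cij * (1.0 / a + 1.0 / b)
        T.append(Ti)
        S.append(Si)
    return T, S

def homogeneity_test(u, v, c, lam0, max_iter=100, tol=1e-10):
    """Iterate (4.2.6); lam0 must sum to zero. Returns (lambda, X^2)."""
    lam = list(lam0)
    for _ in range(max_iter):
        T, S = group_sums(u, v, c, lam)
        inv_S = sum(1.0 / s for s in S)               # 1/S
        H = sum(Ti / Si for Ti, Si in zip(T, S))
        # (4.2.6) with HS written as H / (1/S)
        step = [(T[i] - H / inv_S) / S[i] for i in range(len(lam))]
        lam = [lam[i] + step[i] for i in range(len(lam))]
        if max(abs(d) for d in step) < tol:
            break
    _, S = group_sums(u, v, c, lam)
    x2 = sum(l * l * s for l, s in zip(lam, S))
    return lam, x2

# Rows are birth orders 2, 3-4, 5+; columns are problems, controls.
u = [[20, 10], [26, 16], [27, 14]]     # losses
v = [[82, 54], [41, 30], [22, 23]]     # no losses
c = [[1, 1], [1, 1], [1, 1]]
lam, x2 = homogeneity_test(u, v, c, lam0=[-9.0, 5.0, 4.0])
print([round(x, 3) for x in lam], round(x2, 3))
```

Because every step satisfies $\sum_i \delta\lambda_i = 0$, the multipliers keep summing to zero provided the starting values do.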

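If we change (4.2.4) to a product by taking the antilog of the equations,
it represents a generalization of Norton's [1945] extension of
Bartlett's [1935] test. The solution to the product form of (4.2.4)
is given by

$$\delta\lambda_i = \bigl(R_i - 1/(Sh)\bigr)/(R_i S_i), \tag{4.2.7}$$

where

$$R_i = \prod_j \left(\frac{u_{ij} - c_{ij}\lambda_{i0}}{v_{ij} + c_{ij}\lambda_{i0}}\right)^{c_{ij}},$$

$S_i$ and $1/S$ are as previously defined and $h = \sum_i 1/(R_i S_i)$. For small
values of $c_{ij}$, (4.2.7) may be a more convenient computational form
than (4.2.6) because tables do not have to be consulted. Also in the
case of (4.1.3) with one degree of freedom the product form may be
more convenient. The solution is

$$\delta\lambda = (R_1 - R_2 e^{\epsilon})/(R_1 S_1 + R_2 S_2 e^{\epsilon}), \tag{4.2.8}$$

where, for a single contrast with coefficients $c_i$ and trial value $\lambda_0$,

$$R_1 = \prod_{c_i > 0} \left(\frac{u_i - c_i\lambda_0}{v_i + c_i\lambda_0}\right)^{c_i}, \qquad
R_2 = \prod_{c_i < 0} \left(\frac{u_i - c_i\lambda_0}{v_i + c_i\lambda_0}\right)^{-c_i},$$

$$S_1 = \sum_{c_i > 0} c_i^2\left(\frac{1}{u_i - c_i\lambda_0} + \frac{1}{v_i + c_i\lambda_0}\right) \quad\text{and}\quad
S_2 = \sum_{c_i < 0} c_i^2\left(\frac{1}{u_i - c_i\lambda_0} + \frac{1}{v_i + c_i\lambda_0}\right).$$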


5. TESTING OF THE FIT OF THE MODEL 
AND FITTING THE MODEL 
Even though the significance level and the operating characteristics 
of the tests are not what they are presumed to be, most investigators 
test the agreement of the model with the data before proceeding to 
make tests on the parameters in the model. The procedures described 
in Sections 3 and 4 can be adapted for this purpose. 


5.1 Test of the Fit of the Model. 


The method can be made intuitively plausible by drawing an analogy 
with the analysis of variance. Recall that the residual or error sum of 
squares in the analysis of variance for a factorial experiment can be 
regarded as the sum of squares due to high-order interactions. Hence
this sum of squares can be computed as the sum of squares associated
with a set of contrasts. To obtain the proper test statistic choose $C$
as the set of contrasts which would be interpreted as interaction among
effects included in the model and interpreted as error, and then estimate
the $P_i$ subject to the restraint $Cl = 0$. This is equivalent to requiring
that there be no deviation from the model, which is the hypothesis
one wants to test. Once $C$ is ascertained, the estimation and testing
proceed as previously outlined. The $\lambda$ computed in the process of
making this test can be utilized to compute the ML estimate of $\theta$ without
the iteration required by conventional methods.


5.2 Estimation of Parameters. 


To ascertain the relationship between $\lambda$ and the observed logit,
let us expand $l_i$ about the point $P_{i0}$, the value of $P_i$ if the data fit
the model. Then, to a first-order approximation

$$\log_e (P_i/Q_i) = \log_e (P_{i0}/Q_{i0}) + (P_i - P_{i0})/(P_{i0} Q_{i0}).$$

Now $\lambda' C_{\cdot i}/n_i = P_i - P_{i0}$. Solving for $\log_e (P_{i0}/Q_{i0})$ and replacing
$P_i$ and $Q_i$ by their estimates, $u_i/n_i$ and $v_i/n_i$, we find

$$\log_e (\hat P_{i0}/\hat Q_{i0}) = \log_e (u_i/v_i) - \lambda' C_{\cdot i}/(n_i \hat P_{i0} \hat Q_{i0}),$$

where

$$\hat P_{i0} = (u_i - \lambda' C_{\cdot i})/n_i.$$

Therefore we see that $\log_e (\hat P_{i0}/\hat Q_{i0})$ taken from the test for the fit
of the model represents the working logit.

Thus if we obtain estimates of the $P_i$ subject to a suitably chosen
set of restraints we can use them in the equation

$$A'WA\hat\theta = A'Wl^*,$$

where $W$ is a diagonal matrix of elements $n_i \hat P_{i0} \hat Q_{i0}$ and $l^*$ has elements
$\log_e (\hat P_{i0}/\hat Q_{i0})$, to estimate $\theta$ without the iteration ordinarily associated
with the solution of ML equations of this type.

Although iteration is not required to compute $\hat\theta$ by this method,
it is required for computing $\lambda$. By conventional methods a $k \times k$
matrix must be inverted for each iteration, but by this method an
$(r - k) \times (r - k)$ matrix is inverted in each iteration in computing
$\lambda$ and a $k \times k$ matrix is inverted once in computing $\hat\theta$. Since usually
$r - k < k$ in problems of the type envisioned here, there is some saving
in computation time through the use of this method. Furthermore
for some experiments, $\lambda$ can be obtained by methods given in Section
4.2 without inverting matrices. Or, as in the example that follows,
we may need to compute $\lambda$ in the process of making the tests.


5.3 Example. 
Cochran [1954] gives the following data:

TABLE 1
DATA ON NUMBER OF MOTHERS WITH PREVIOUS INFANT LOSSES

                              No. of Mothers with
Birth Order    Group        Losses    No Losses    Total
2              Problems       20         82         102
               Controls       10         54          64
3-4            Problems       26         41          67
               Controls       16         30          46
5+             Problems       27         22          49
               Controls       14         23          37


It is desired to compare the mothers of Baltimore school children 
who have been referred by their teachers as presenting behavioral 
problems to mothers of a comparable group of control children. For 
each mother it is recorded whether she had suffered any infant losses 
previous to the child in the study. 

The model assumed is 


$$l_{ij} = \mu + \alpha_i + \beta_j, \qquad \sum_i \alpha_i = 0, \quad \sum_j \beta_j = 0,$$


where $\alpha_i$ is the effect of problem or control depending on whether
$i = 1$ or 2, and $\beta_j$ is the effect of the $j$-th birth order, $j = 1, 2, 3$.

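First we will test the assumption that the model proposed fits the
data. This may also be regarded as the test of the hypothesis of no
$\alpha\beta$-interaction. If there is no interaction

$$l_{11} - l_{12} = l_{21} - l_{22} = l_{31} - l_{32}, \tag{5.3.1}$$

where in these restraints the first subscript of $l$ denotes birth order and
the second denotes problems (1) or controls (2).
A convenient form of (5.3.1) to use as restraints in estimating the $P_{ij}$ is

$$l_{11} - l_{12} - l_{31} + l_{32} = 0, \qquad l_{21} - l_{22} - l_{31} + l_{32} = 0. \tag{5.3.2}$$

The estimates are:

$$\hat P_{11} = (20 - \lambda_1)/102, \quad \hat P_{21} = (26 - \lambda_2)/67, \quad \hat P_{31} = (27 + \lambda_1 + \lambda_2)/49,$$
$$\hat P_{12} = (10 + \lambda_1)/64, \quad \hat P_{22} = (16 + \lambda_2)/46, \quad \hat P_{32} = (14 - \lambda_1 - \lambda_2)/37.$$

To complete the solution, we will use the antilog form.
Let $\lambda_1 + \lambda_2 = -\lambda_3$ so that $\lambda_1 + \lambda_2 + \lambda_3 = 0$. Then the equations

$$\left(\frac{20 - \lambda_1}{82 + \lambda_1}\right)\left(\frac{54 - \lambda_1}{10 + \lambda_1}\right) = \left(\frac{26 - \lambda_2}{41 + \lambda_2}\right)\left(\frac{30 - \lambda_2}{16 + \lambda_2}\right) = \left(\frac{27 - \lambda_3}{22 + \lambda_3}\right)\left(\frac{23 - \lambda_3}{14 + \lambda_3}\right)$$

must be solved for $\lambda_1$, $\lambda_2$, $\lambda_3$ subject to the restriction $\lambda_1 + \lambda_2 + \lambda_3 = 0$.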

The starting values of $\lambda_{i0}$ are determined as follows: The objective
of the iteration can be regarded as to make $R_1 = R_2 = R_3$, since when
this occurs all $\delta\lambda_i = 0$. Therefore for the first trial values only $R_1$, $R_2$
and $R_3$ need to be computed. The initial values $\lambda_{10} = \lambda_{20} = \lambda_{30} = 0$
can be used if there are no zeros in the success and failure classifications.
If for this test on this set of data we start with all $\lambda_{i0} = 0$, five iterations
are required for accuracy comparable to the solution given. Needless
to say, some practice in choosing $\lambda_{i0}$ can save computing time.


After some preliminary examination of the type described, the 
values of 


$$\lambda_{10} = -.5, \quad \lambda_{20} = -1.0, \quad \lambda_{30} = 1.5$$


were chosen for use in the complete cycles of iteration. After the 
computation shown in Table 2 we find 


$$\lambda_1 = -.503, \quad \lambda_2 = -1.213, \quad \lambda_3 = 1.716.$$

TABLE 2
COMPUTATION FOR TEST OF INTERACTION

                 Iteration 1    Iteration 2
R_1               1.443009       1.443009
R_2               1.395000       1.443979
R_3               1.505148       1.444384
S_1                .184662        .184662
S_2                .160962        .161549
S_3                .192797        .192333
1/(R_1 S_1)       3.752782       3.752782
1/(R_2 S_2)       4.453510       4.286823
1/(R_3 S_3)       3.446054       3.599673
1/Sh              1.443038       1.443789
δλ_1               .000          -.003
δλ_2              -.214           .001
δλ_3               .214           .002


Of course this is only the approximate solution. Computation was
stopped at the second iteration because the $\delta\lambda_i$ were trivial, indicating
that further iteration would have very little effect on the test statistic.
Then

$$X^2 = \lambda_1^2 S_1 + \lambda_2^2 S_2 + \lambda_3^2 S_3.$$

For a $\chi^2$-test with two degrees of freedom and .05-significance level,
the critical region is given by $X^2 > 5.991$. Therefore we do not reject
the hypothesis of no interaction.
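The product-form step (4.2.7) behind Table 2 can be retraced with a short Python sketch. The names `product_form_step` and `fit_product_form` are hypothetical, and the layout (one row per birth order, columns problems and controls, $c_{ij} = 1, -1$) is an assumption chosen to match the restraints (5.3.2); it is offered as a check on the arithmetic rather than as the author's procedure.

```python
def product_form_step(u, v, c, lam):
    """One step of (4.2.7): returns R_i, S_i, 1/(Sh) and delta-lambda_i."""
    R, S = [], []
    for i in range(len(u)):
        Ri, Si = 1.0, 0.0
        for uij, vij, cij in zip(u[i], v[i], c[i]):
            a = uij - cij * lam[i]        # n_ij * P_ij
            b = vij + cij * lam[i]        # n_ij * Q_ij
            Ri *= (a / b) ** cij
            Si += cij * cij * (1.0 / a + 1.0 / b)
        R.append(Ri)
        S.append(Si)
    inv_S = sum(1.0 / s for s in S)                    # 1/S
    h = sum(1.0 / (r * s) for r, s in zip(R, S))
    target = inv_S / h                                 # 1/(Sh)
    step = [(R[i] - target) / (R[i] * S[i]) for i in range(len(R))]
    return R, S, target, step

def fit_product_form(u, v, c, lam0, max_iter=100, tol=1e-10):
    """Repeat (4.2.7) until the corrections vanish; returns (lambda, X^2)."""
    lam = list(lam0)
    for _ in range(max_iter):
        _, _, _, step = product_form_step(u, v, c, lam)
        lam = [lam[i] + step[i] for i in range(len(lam))]
        if max(abs(d) for d in step) < tol:
            break
    _, S, _, _ = product_form_step(u, v, c, lam)
    return lam, sum(l * l * s for l, s in zip(lam, S))

u = [[20, 10], [26, 16], [27, 14]]      # losses: problems, controls
v = [[82, 54], [41, 30], [22, 23]]      # no losses
c = [[1, -1], [1, -1], [1, -1]]

R, S, target, step = product_form_step(u, v, c, [-0.5, -1.0, 1.5])
lam, x2 = fit_product_form(u, v, c, [-0.5, -1.0, 1.5])
print([round(r, 6) for r in R], [round(d, 3) for d in step])
print([round(l, 3) for l in lam], round(x2, 3))
```

The first call corresponds to the Iteration 1 column of Table 2; the second carries the iteration to convergence and evaluates $X^2 = \sum_i \lambda_i^2 S_i$.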

Now that we have some assurance that the model fits the data,
we can proceed to test hypotheses about the parameters in the model.
The objective of the study is to ascertain whether there is a difference
between problems and controls. This may be stated as

$$H_0: \alpha_1 - \alpha_2 = 0.$$

The restraint associated with this test is

$$l_{11} + l_{21} + l_{31} - l_{12} - l_{22} - l_{32} = 0,$$

which gives the equation

$$\left(\frac{20 - \lambda}{82 + \lambda}\right)\left(\frac{26 - \lambda}{41 + \lambda}\right)\left(\frac{27 - \lambda}{22 + \lambda}\right) = \left(\frac{10 + \lambda}{54 - \lambda}\right)\left(\frac{16 + \lambda}{30 - \lambda}\right)\left(\frac{14 + \lambda}{23 - \lambda}\right).$$

As a preliminary estimate of $\lambda$, choose $\lambda_0 = 2$. After two iterations we
find $\lambda = 2.188$ and $X^2 = 2.475$. Therefore if we put the significance
level at .05 we do not reject $H_0$. It is interesting to note that the
probability of observing an $X^2 > 2.475$ is approximately 12 percent
if $H_0$ is true while the test Cochran [1954] suggests has probability of
approximately 10 percent.

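Another hypothesis of interest is

$$H_0: \beta_1 = \beta_2 = \beta_3,$$

that is, equality of birth-order effects. To test this hypothesis we use
the restraints

$$l_{11} + l_{12} = l_{21} + l_{22} = l_{31} + l_{32},$$

which can be written

$$l_{11} + l_{12} - l_{31} - l_{32} = 0, \qquad l_{21} + l_{22} - l_{31} - l_{32} = 0.$$

Using the same technique as in the test for interaction, we obtain the
equations

$$\left(\frac{20 - \lambda_1}{82 + \lambda_1}\right)\left(\frac{10 - \lambda_1}{54 + \lambda_1}\right) = \left(\frac{26 - \lambda_2}{41 + \lambda_2}\right)\left(\frac{16 - \lambda_2}{30 + \lambda_2}\right) = \left(\frac{27 - \lambda_3}{22 + \lambda_3}\right)\left(\frac{14 - \lambda_3}{23 + \lambda_3}\right).$$

Starting with $\lambda_{10} = -9$, $\lambda_{20} = 5$, $\lambda_{30} = 4$, after three iterations we
find that

$$\lambda_1 = -9.925, \quad \lambda_2 = 3.529, \quad \lambda_3 = 6.396,$$

and $X^2$ with two degrees of freedom is 24.239. Therefore we reject $H_0$.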

Now that there is evidence that birth order has a significant effect
it will be instructive, as an example of the technique for testing a single
contrast, to ascertain the nature of the effect. For simplicity let us
assume that the birth orders 2, 3-4, 5+ are approximately equally
spaced. The question of interest is whether the effect of birth order is
linear or whether it requires a second-degree polynomial equation to
describe the effect.
This can be ascertained by testing

$$H_{01}: \beta_1 - \beta_3 = 0,$$

and

$$H_{02}: \beta_1 - 2\beta_2 + \beta_3 = 0.$$

The restrictions for estimating the $P_{ij}$ are

$$l_{11} + l_{12} - l_{31} - l_{32} = 0$$

and

$$l_{11} + l_{12} - 2l_{21} - 2l_{22} + l_{31} + l_{32} = 0$$

for testing $H_{01}$ and $H_{02}$ respectively. For the test of $H_{01}$, the equation is

$$\left(\frac{20 - \lambda}{82 + \lambda}\right)\left(\frac{10 - \lambda}{54 + \lambda}\right) = \left(\frac{27 + \lambda}{22 - \lambda}\right)\left(\frac{14 + \lambda}{23 - \lambda}\right).$$

Starting with $\lambda_0 = -7$, in three iterations we find $\lambda$ to be $-7.535$.


The computations as given by (4.2.8) are shown in the following 
table. 


TABLE 3
COMPUTATION FOR TEST OF A SINGLE CONTRAST

          Iteration 1    Iteration 2    Iteration 3
R_1         .130213        .139294        .139544
R_2         .160919        .140070        .139537
S_1         .130470        .128352        .128297
S_2         .260673        .272320        .272661
δλ         -.521          -.014           .000

$$X^2 = \lambda^2 (S_1 + S_2) = 22.765.$$

This is far beyond the .05-point for $\chi^2$ with one degree of freedom.
Therefore we reject $H_{01}$.
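The single-contrast formula (4.2.8) that generates Table 3 can be sketched as follows. The helper names `single_contrast_step` and `fit_single_contrast` are illustrative assumptions, as is the cell ordering (birth order 2 problems, 2 controls, 3-4 problems, 3-4 controls, 5+ problems, 5+ controls); cells with positive coefficients contribute to $R_1$ and $S_1$, cells with negative coefficients to $R_2$ and $S_2$.

```python
import math

def single_contrast_step(u, v, c, lam, eps=0.0):
    """One step of (4.2.8): returns R1, R2, S1, S2 and delta-lambda."""
    R1 = R2 = 1.0
    S1 = S2 = 0.0
    for ui, vi, ci in zip(u, v, c):
        if ci == 0:
            continue                      # cell not involved in the contrast
        a = ui - ci * lam                 # n_i * P_i
        b = vi + ci * lam                 # n_i * Q_i
        term = ci * ci * (1.0 / a + 1.0 / b)
        if ci > 0:
            R1 *= (a / b) ** ci
            S1 += term
        else:
            R2 *= (a / b) ** (-ci)
            S2 += term
    k = math.exp(eps)
    step = (R1 - R2 * k) / (R1 * S1 + R2 * S2 * k)
    return R1, R2, S1, S2, step

def fit_single_contrast(u, v, c, lam0, eps=0.0, max_iter=100, tol=1e-10):
    """Repeat (4.2.8) to convergence; returns (lambda, X^2)."""
    lam = lam0
    for _ in range(max_iter):
        step = single_contrast_step(u, v, c, lam, eps)[4]
        lam += step
        if abs(step) < tol:
            break
    _, _, S1, S2, _ = single_contrast_step(u, v, c, lam, eps)
    return lam, lam * lam * (S1 + S2)

u = [20, 10, 26, 16, 27, 14]     # losses
v = [82, 54, 41, 30, 22, 23]     # no losses

linear = [1, 1, 0, 0, -1, -1]            # beta_1 - beta_3 = 0
first = single_contrast_step(u, v, linear, -7.0)
lam_lin, x2_lin = fit_single_contrast(u, v, linear, -7.0)

quadratic = [1, 1, -2, -2, 1, 1]         # beta_1 - 2 beta_2 + beta_3 = 0
lam_quad, x2_quad = fit_single_contrast(u, v, quadratic, -1.0)

groups = [1, -1, 1, -1, 1, -1]           # problems versus controls
lam_grp, x2_grp = fit_single_contrast(u, v, groups, 2.0)
print(round(lam_lin, 3), round(x2_lin, 3))
```

The same two functions cover all three one-degree-of-freedom tests in this section because only the coefficient vector changes.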
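For $H_{02}$ the equation is

$$\left(\frac{20 - \lambda}{82 + \lambda}\right)\left(\frac{10 - \lambda}{54 + \lambda}\right)\left(\frac{27 - \lambda}{22 + \lambda}\right)\left(\frac{14 - \lambda}{23 + \lambda}\right) = \left(\frac{26 + 2\lambda}{41 - 2\lambda}\right)^2\left(\frac{16 + 2\lambda}{30 - 2\lambda}\right)^2.$$

Starting with $\lambda_0 = -1$, after two iterations we find $\lambda$ to be $-1.191$ and
$X^2 = 1.478$. Therefore we do not reject $H_{02}$ and we conclude that
within the range covered by the data, birth-order effects are linear on
the logistic scale.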


For the data presented here for illustrative purposes, it is not particularly
helpful to estimate the parameters in the model. However,
we will proceed with the estimation to illustrate the non-iterative
procedure for estimating parameters. The $\hat P_{ij}$ obtained under the
no-interaction hypothesis are

$$\hat P_{11} = .2010, \quad \hat P_{21} = .4062, \quad \hat P_{31} = .5160,$$
$$\hat P_{12} = .1483, \quad \hat P_{22} = .3215, \quad \hat P_{32} = .4248,$$

and the associated $l^*_{ij}$ are

$$l^*_{11} = -1.380, \quad l^*_{21} = -.380, \quad l^*_{31} = .064,$$
$$l^*_{12} = -1.750, \quad l^*_{22} = -.747, \quad l^*_{32} = -.302.$$

The model can be reparameterized in several ways to make the 
equations non-singular. One way is to let a = a, — a1, m = Bi — fo, 
n2 = B, — B;, then 


1 1-1 
1-1 O -1 
@ =k 
and the normal equations are

$$\begin{bmatrix} 71.93772 & -17.62000 & -1.72980 & 3.18720 \\ -17.62000 & 71.93772 & -2.17140 & -5.10068 \\ -1.72980 & -2.17140 & 50.65988 & 24.46504 \\ 3.18720 & -5.10068 & 24.46504 & 45.74288 \end{bmatrix} \begin{bmatrix} \hat\mu \\ \hat\alpha \\ \hat\eta_1 \\ \hat\eta_2 \end{bmatrix} = \begin{bmatrix} -52.3495 \\ 3.5944 \\ -23.1122 \\ -34.8062 \end{bmatrix}.$$

The solution is

$$\hat\mu = -.749, \quad \hat\alpha = -.184, \quad \hat\eta_1 = -.185, \quad \hat\eta_2 = -.630,$$

and the variances of the estimates are .0149, .0149, .0267 and .0298
respectively. The predicted logits using these estimates are

$$\hat l_{11} = -1.381, \quad \hat l_{21} = -.380, \quad \hat l_{31} = .064,$$
$$\hat l_{12} = -1.748, \quad \hat l_{22} = -.748, \quad \hat l_{32} = -.303,$$

which are very close to those used as the original estimates. Unless
more than three-place accuracy is desired no further iteration is
necessary.
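The non-iterative estimation step $A'WA\hat\theta = A'Wl^*$ can be illustrated in Python. Two things in this sketch are assumptions rather than statements from the text: the design matrix `A` below uses a coding ($-1$ for problems, $+1$ for controls; $(1, 1)$, $(-1, 0)$, $(0, -1)$ for birth orders 2, 3-4, 5+) that was inferred from the printed normal equations, and the variances are taken as the diagonal of $(A'WA)^{-1}$. The helper `solve_linear` is a hypothetical name.

```python
import math

def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

# Cells: problems at birth orders 2, 3-4, 5+, then controls likewise.
n = [102, 67, 49, 64, 46, 37]
P0 = [0.2010, 0.4062, 0.5160, 0.1483, 0.3215, 0.4248]   # restrained estimates

# Assumed coding of (mu, alpha, eta_1, eta_2); inferred, not quoted.
A = [[1, -1,  1,  1],
     [1, -1, -1,  0],
     [1, -1,  0, -1],
     [1,  1,  1,  1],
     [1,  1, -1,  0],
     [1,  1,  0, -1]]

W = [n[i] * P0[i] * (1.0 - P0[i]) for i in range(6)]      # n_i P_i0 Q_i0
l_star = [math.log(p / (1.0 - p)) for p in P0]            # working logits

k = 4
AtWA = [[sum(A[i][a] * W[i] * A[i][b] for i in range(6)) for b in range(k)]
        for a in range(k)]
AtWl = [sum(A[i][a] * W[i] * l_star[i] for i in range(6)) for a in range(k)]

theta = solve_linear(AtWA, AtWl)
# Diagonal of the inverse, one unit vector at a time.
variances = [solve_linear(AtWA, [1.0 if i == a else 0.0 for i in range(k)])[a]
             for a in range(k)]
print([round(t, 3) for t in theta])
```

Only one linear system in $k = 4$ unknowns is solved here, which is the saving the section describes.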




6. SUMMARY AND DISCUSSION 


In this paper we have developed new methods for making tests on
the parameters in the logistic model. For many types of data the
methods developed here allow computation of $X^2$ for tests of hypotheses
about some subset of the parameters in the model without having to
estimate them all as is usually done when using the logistic model.
This might be particularly useful in combining 2 × 2 tables, as in the
test of problems versus controls in the example given, which can be
regarded as three 2 × 2 tables. If all that we desire to do is to test
the difference between problems and controls, we need to do only that
part of the work done for testing $\alpha_1 - \alpha_2 = 0$ in the example, if we
can assume that the logistic model fits the data.

Once the $\lambda$ associated with a properly chosen $C$-matrix has been determined,
it can be used to compute the ML estimate of the parameters
in the model without further iteration. The relationship between
$\lambda$ and the working logit and between the logistic model and Bartlett's
test of interaction is shown.
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A GENERALIZED MODEL OF A HOST-PATHOGEN SYSTEM 


C. J. MODE
Department of Mathematics, Montana State College 
Bozeman, Montana, U.S.A. 


INTRODUCTION 


This paper is a sequel to a paper, “A Model of a Host-pathogen 
System with Particular Reference to the Rusts of Cereals”, which 
appeared in Biometrical Genetics [1960]. The former paper was re- 
stricted to the summer stage of the rusts of cereals, but in this paper 
a wider class of host-pathogen systems will be considered and some 
assumptions will be relaxed. Specifically, the assumptions of random 
association of the host and pathogen, equal number of varieties of the 
host and races of the pathogen, and constant fitness functions will be 
dropped. 

The results of this paper are intended to apply to host-pathogen 
systems satisfying the following conditions: 

1. The pathogen reproduces on the host. 

2. The host may be differentiated into varieties on the basis of 
its resistance to the races of the pathogen. 

3. The pathogen may be differentiated into races on the basis of 
its ability to grow on a set of host varieties. 

4. Host resistance to a particular race of the pathogen is genetically 
controlled. 

5. The damage to the host caused by the pathogen in a given time 
interval is directly related to the increase in number in the pathogen 
population during the given time interval. 

It should be pointed out that conditions (1) and (4) imply that 
no assumptions are made with respect to the mode of reproduction of 
the pathogen and the mode of inheritance of host resistance to the 
pathogen. 

Many economic crop plants and their foliar diseases caused by 
pathogenic fungi are examples of host-pathogen systems satisfying the 
above conditions. These host-pathogen systems are characterized by 
frequent shifts in the racial frequencies of the pathogen population, 
making it difficult to maintain host resistance to the pathogen. It
seems plausible that damage to the host in such host-pathogen systems
could be adequately controlled if (1) the racial structure (the relative
frequencies of the races) in the pathogen population were stabilized
and (2) the increase in number in the pathogen population were held
below some critical number during the growing season of the crop.

In a previous paper, Mode [1960], the writer was unable to find 
conditions under which the racial structure of the pathogen popula- 
tion was stabilized. The purpose of this paper is two-fold, namely, (1) 
to characterize a host-pathogen system containing an arbitrary number 
of varieties of the host and an arbitrary number of races of the pathogen 
and (2) to find some conditions under which condition (1) of the previous 
paragraph is met. It will be assumed that the host-pathogen systems 
under consideration may be characterized in terms of continuous and 
differentiable functions of time. 

In passing it is interesting to note that in at least one case, the 
genetics of both the host and the pathogen has been worked out. The 
reader is referred to the remarkable paper of Flor [1956] in which the 
complementary genetic systems of flax and flax rust are discussed. 
As an example of experimental work in the field under consideration, 
the reader is referred to the recent paper of Suneson [1960]. 


1. POPULATION NUMBERS AND THE ASSOCIATION 
OF THE HOST AND PATHOGEN 


Let $H$ be the number of members of the host population and let
$H_i$ be the number of the members belonging to variety $V_i$ $(i = 1, \ldots, m)$
at time $t$. Similarly, let $P$ be the number of members of the pathogen
population and $P_j$ the number of these members belonging to race
$R_j$ $(j = 1, \ldots, n)$ at time $t$. The relations of the $H_i$ and $P_j$ to $H$ and $P$
are given by

$$H = \sum_{i=1}^{m} H_i \quad\text{and}\quad P = \sum_{j=1}^{n} P_j. \tag{1.1}$$

Let a member $h$ of the host population and a member $p$ of the
pathogen population be chosen at random. The probability that $h$
belongs to variety $V_i$ is

$$\Pr(h \in V_i) = H_i/H = x_i \tag{1.2}$$

and the probability that $p$ belongs to race $R_j$ is

$$\Pr(p \in R_j) = P_j/P = y_j. \tag{1.3}$$


We shall consider next the association of the host and pathogen.
Let $\phi_{ij}$ be the probability that at time $t$ a member $h$ of the host population
belonging to variety $V_i$ and a member $p$ of the pathogen population
belonging to race $R_j$ are associated. In symbols we may write

$$\Pr(h \in V_i, \; p \in R_j) = \phi_{ij}. \tag{1.4}$$

If the host and pathogen are associated at random, then we have
independence in the probability sense so that

$$\phi_{ij} = x_i y_j \tag{1.5}$$

for all $i$ and $j$.

In general, however, we would expect non-random association of the
host and pathogen to be the rule due to the nature of the specific reaction
of the host and pathogen to each other. In order to take non-randomness
into account we introduce a measure of departure from
random association, $\theta_{ij}$. The measure of departure from random
association is a positive number satisfying the relation

$$\phi_{ij} = \theta_{ij} x_i y_j. \tag{1.6}$$

Kimura [1958] introduced this measure in connection with the study
of nonrandom mating diploid populations. We note that the measure
$\theta_{ij}$ is related to certain conditional probabilities. The conditional
probability that a member $p$ of the pathogen population belongs to
race $R_j$ given that it is associated with a member $h$ of the host population
belonging to variety $V_i$ is

$$\Pr(p \in R_j \mid h \in V_i) = \phi_{ij}/x_i = y_j\theta_{ij}. \tag{1.7}$$

Similarly, the conditional probability that a member $h$ of the host
population belongs to variety $V_i$ given that it is associated with a
member $p$ of the pathogen population belonging to race $R_j$ is

$$\Pr(h \in V_i \mid p \in R_j) = \phi_{ij}/y_j = x_i\theta_{ij}. \tag{1.8}$$

Since $x_i\theta_{ij}$ and $y_j\theta_{ij}$ are conditional probabilities, it follows that

$$\sum_i x_i\theta_{ij} = \sum_j y_j\theta_{ij} = 1. \tag{1.9}$$

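A small numerical check may make the role of $\theta_{ij}$ concrete. In the Python sketch below the joint probabilities `phi` are invented purely for illustration (three host varieties, two pathogen races); the code recovers $x_i$, $y_j$ and $\theta_{ij} = \phi_{ij}/(x_i y_j)$ and confirms the normalization (1.9).

```python
# Hypothetical joint association probabilities phi[i][j]; they sum to one.
phi = [[0.20, 0.10],
       [0.05, 0.25],
       [0.15, 0.25]]

m, n = len(phi), len(phi[0])
x = [sum(phi[i]) for i in range(m)]                        # Pr(h in V_i)
y = [sum(phi[i][j] for i in range(m)) for j in range(n)]   # Pr(p in R_j)

# Measure of departure from random association, from (1.6).
theta = [[phi[i][j] / (x[i] * y[j]) for j in range(n)] for i in range(m)]

# Conditional probabilities (1.7) and (1.8) sum to one, which is (1.9).
race_given_variety = [sum(y[j] * theta[i][j] for j in range(n)) for i in range(m)]
variety_given_race = [sum(x[i] * theta[i][j] for i in range(m)) for j in range(n)]
print(race_given_variety, variety_given_race)
```

Values of $\theta_{ij}$ above one mark variety-race pairs that occur together more often than random association would give, and values below one mark pairs that occur less often.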

2. THE FITNESS FUNCTIONS 

Due to the specific nature of the reaction of the host and pathogen
to each other, with each association $ij$ of the host and pathogen we
shall associate two fitness functions, one for the host and one for the
pathogen. The fitness functions may be regarded as measures of the
ability of the host and pathogen to reproduce in a given association.
Let $\lambda_{ij}$ be the fitness function of the host and $\mu_{ij}$ the fitness function
of the pathogen in the $ij$-th association. Since the ability of the host
and pathogen to reproduce in a given association may depend on the
$x_i$ and $y_j$, we shall allow $\lambda_{ij}$ and $\mu_{ij}$ to be real-valued functions of the
real variables $x_i$ and $y_j$. Clearly, for every $\phi_{ij}$ we associate a $\lambda_{ij}$ and
$\mu_{ij}$ so that the pair $(\lambda_{ij}, \mu_{ij})$ may be regarded as a random variable
with respect to the $\phi_{ij}$.

The expected value of $\lambda_{ij}$ or the mean fitness of the host population is

$$E(\lambda) = \sum_i \sum_j \phi_{ij}\lambda_{ij} = \lambda_{..}, \tag{2.1}$$

and the expected value of $\mu_{ij}$ or the mean fitness of the pathogen population
is

$$E(\mu) = \sum_i \sum_j \phi_{ij}\mu_{ij} = \mu_{..}. \tag{2.2}$$

The mean fitness of host variety $V_i$, or the conditional expectation
of $\lambda_{ij}$ given $V_i$, is

$$E(\lambda \mid V_i) = \sum_j y_j\theta_{ij}\lambda_{ij} = \lambda_{i.}, \tag{2.3}$$

and the mean fitness of race $R_j$ of the pathogen, or the conditional
expectation of $\mu_{ij}$ given $R_j$, is

$$E(\mu \mid R_j) = \sum_i x_i\theta_{ij}\mu_{ij} = \mu_{.j}. \tag{2.4}$$

The conditional expectations of $\lambda_{ij}$ given $R_j$ and $\mu_{ij}$ given $V_i$,
which are also of interest, are given by

$$E(\lambda \mid R_j) = \sum_{i=1}^{m} x_i\theta_{ij}\lambda_{ij} = \lambda_{.j}, \qquad
E(\mu \mid V_i) = \sum_{j=1}^{n} y_j\theta_{ij}\mu_{ij} = \mu_{i.}. \tag{2.5}$$

We now make the following definitions of the measures of fitness
of variety $V_i$ and race $R_j$. The measure of fitness of variety $V_i$ of the
host population is defined by the differential equation

$$d(\log H_i)/dt = \lambda_{i.}, \tag{2.6}$$

and the measure of fitness of race $R_j$ of the pathogen population is defined by

$$d(\log P_j)/dt = \mu_{.j}. \tag{2.7}$$

These definitions of the measures of fitness of variety $V_i$ and race $R_j$
are equivalent to those given in the previous paper.

From the definitions of the measure of fitness of variety $V_i$ and
race $R_j$, it may be shown that the change in the probabilities $x_i$ and $y_j$
in time is characterized by the set of differential equations

$$dx_i/dt = x_i(\lambda_{i.} - \lambda_{..}), \quad (i = 1, \ldots, m), \qquad
dy_j/dt = y_j(\mu_{.j} - \mu_{..}), \quad (j = 1, \ldots, n). \tag{2.8}$$

Finally, by differentiating $H$ and $P$ with respect to time, we may
show that the changes in population number in the host and pathogen
populations are given by

$$d(\log H)/dt = \lambda_{..}, \qquad d(\log P)/dt = \mu_{..}. \tag{2.9}$$
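To see how equations (2.8) behave, the following Python sketch integrates them with a simple Euler scheme. Everything numerical here is an assumption made for illustration: the fitness values are constants chosen arbitrarily, association is taken to be random ($\theta_{ij} = 1$) throughout, and the step size and horizon are arbitrary; the function name `euler_step` is hypothetical.

```python
def euler_step(x, y, lam, mu, theta, dt):
    """Advance the frequencies x_i and y_j by one Euler step of (2.8)."""
    m, n = len(x), len(y)
    # Mean fitness of each variety, (2.3), and of each race, (2.4).
    lam_row = [sum(y[j] * theta[i][j] * lam[i][j] for j in range(n))
               for i in range(m)]
    mu_col = [sum(x[i] * theta[i][j] * mu[i][j] for i in range(m))
              for j in range(n)]
    lam_bar = sum(x[i] * lam_row[i] for i in range(m))     # lambda..
    mu_bar = sum(y[j] * mu_col[j] for j in range(n))       # mu..
    new_x = [x[i] * (1.0 + dt * (lam_row[i] - lam_bar)) for i in range(m)]
    new_y = [y[j] * (1.0 + dt * (mu_col[j] - mu_bar)) for j in range(n)]
    return new_x, new_y

# Hypothetical constant fitness functions: variety 1 is fitter than
# variety 2 against both races.
lam = [[0.3, 0.2],
       [0.1, 0.1]]
mu = [[0.5, 0.2],
      [0.1, 0.6]]
theta = [[1.0, 1.0],
         [1.0, 1.0]]          # random association assumed

x0, y0 = [0.5, 0.5], [0.5, 0.5]
x, y = list(x0), list(y0)
for _ in range(2000):
    x, y = euler_step(x, y, lam, mu, theta, dt=0.01)
print([round(v, 4) for v in x], [round(v, 4) for v in y])
```

Because the right-hand sides of (2.8) sum to zero over $i$ and over $j$, the frequencies continue to sum to one at every step, which is a convenient check on any implementation.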


3. THE VARIATION AND COVARIATION IN FITNESS 
IN A HOST-PATHOGEN SYSTEM 


We continue our characterization of the host-pathogen system by
defining certain variances and covariances in fitness.
The total variance in fitness in the host population is

$$\operatorname{var}(\lambda) = E(\lambda - \lambda_{..})^2 = \sum_i \sum_j \phi_{ij}(\lambda_{ij} - \lambda_{..})^2, \tag{3.1}$$

the total variance in fitness in the pathogen population is

$$\operatorname{var}(\mu) = E(\mu - \mu_{..})^2 = \sum_i \sum_j \phi_{ij}(\mu_{ij} - \mu_{..})^2, \tag{3.2}$$

and the total covariance in fitness in the host-pathogen system is

$$\operatorname{cov}(\lambda, \mu) = E(\lambda - \lambda_{..})(\mu - \mu_{..}) = \sum_i \sum_j \phi_{ij}(\lambda_{ij} - \lambda_{..})(\mu_{ij} - \mu_{..}). \tag{3.3}$$

In addition to the variance and covariance in fitness, we may also
define certain components of variance and covariance which are useful
in characterizing a host-pathogen system.

The variance in fitness in the host population attributable to varieties
is

$$\operatorname{var}(\lambda; V) = \sum_i x_i(\lambda_{i.} - \lambda_{..})^2, \tag{3.4}$$

the variance in fitness attributable to races is

$$\operatorname{var}(\lambda; R) = \sum_j y_j(\lambda_{.j} - \lambda_{..})^2, \tag{3.5}$$

and the variance in fitness in the host population attributable to the
interaction of varieties and races is

$$\operatorname{var}(\lambda; VR) = \sum_i \sum_j \phi_{ij}(\lambda_{ij} - \lambda_{i.} - \lambda_{.j} + \lambda_{..})^2. \tag{3.6}$$

By continuing in the same way we may define analogous components
of variance in fitness for the pathogen population. Thus we shall let
var $(\mu; V)$, var $(\mu; R)$, and var $(\mu; VR)$ stand for the components of
variance in fitness in the pathogen population attributable to varieties,
races, and the interaction of varieties and races, respectively. Similarly,
we shall let cov $(\lambda, \mu; V)$, cov $(\lambda, \mu; R)$, and cov $(\lambda, \mu; VR)$ represent
the components of covariance in fitness attributable to varieties, races,
and the interaction of varieties and races respectively. These components
of covariance are, of course, defined in the obvious way.

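In particular, if the host and pathogen are associated at random
so that $\theta_{ij} = 1$ for all i and j, then the following relations among the
total variance and covariance and the components of variance and
covariance hold.

$$\begin{aligned}
\operatorname{var}(\lambda) &= \operatorname{var}(\lambda; V) + \operatorname{var}(\lambda; R) + \operatorname{var}(\lambda; VR), \\
\operatorname{var}(\mu) &= \operatorname{var}(\mu; V) + \operatorname{var}(\mu; R) + \operatorname{var}(\mu; VR), \\
\operatorname{cov}(\lambda, \mu) &= \operatorname{cov}(\lambda, \mu; V) + \operatorname{cov}(\lambda, \mu; R) + \operatorname{cov}(\lambda, \mu; VR).
\end{aligned} \tag{3.7}$$

Relations (3.7) will not hold in general, however, if the association
of the host and pathogen is nonrandom.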
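
The first of relations (3.7) can be checked numerically. The Python sketch below is illustrative only and assumes random association, so that the weight attached to the pair (i, j) is $x_i y_j$, with $\lambda_{i.} = \sum_j y_j \lambda_{ij}$ and $\lambda_{.j} = \sum_i x_i \lambda_{ij}$; the variety and race components are taken as weighted sums of squared deviations of these marginal means, and the interaction component as the weighted sum of squared residuals. The function name and the numerical values are hypothetical.

```python
def variance_components(x, y, lam):
    """Total variance in fitness and its V, R, and VR components (random association)."""
    m, n = len(x), len(y)
    lam_row = [sum(y[j] * lam[i][j] for j in range(n)) for i in range(m)]  # lam_i.
    lam_col = [sum(x[i] * lam[i][j] for i in range(m)) for j in range(n)]  # lam_.j
    lam_bar = sum(x[i] * lam_row[i] for i in range(m))                     # lam_..
    total = sum(x[i] * y[j] * (lam[i][j] - lam_bar) ** 2
                for i in range(m) for j in range(n))
    var_v = sum(x[i] * (lam_row[i] - lam_bar) ** 2 for i in range(m))
    var_r = sum(y[j] * (lam_col[j] - lam_bar) ** 2 for j in range(n))
    var_vr = sum(x[i] * y[j] * (lam[i][j] - lam_row[i] - lam_col[j] + lam_bar) ** 2
                 for i in range(m) for j in range(n))
    return total, var_v, var_r, var_vr


# Hypothetical fitness values for two varieties and three races.
x = [0.3, 0.7]
y = [0.2, 0.5, 0.3]
lam = [[0.4, -0.1, 0.2], [-0.3, 0.5, 0.1]]
total, var_v, var_r, var_vr = variance_components(x, y, lam)
print(total, var_v + var_r + var_vr)
```

The two printed numbers agree up to rounding; with weights that are not of the product form $x_i y_j$ the additive decomposition is not guaranteed.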


4. THE CHANGE OF $\lambda_{..}$, $\mu_{..}$, VAR($\lambda$), VAR($\mu$),
AND COV($\lambda$, $\mu$) IN TIME


We complete our characterization of the host-pathogen system by
finding differential equations characterizing the change in $\lambda_{..}$, $\mu_{..}$,
$\operatorname{var}(\lambda)$, $\operatorname{var}(\mu)$, and $\operatorname{cov}(\lambda, \mu)$ in time. The proofs of the results of
this section are easily obtained by straightforward differentiation of
the functions in question with respect to time and by using the results
and definitions of the preceding sections.

When one wishes to find these differential equations, certain variables
arise. A considerable simplification in representation may be gained
if we set $f_{ij} = d(\log \theta_{ij})/dt$, $g_{ij} = d(\log \phi_{ij})/dt$, $\dot\lambda_{ij} = d\lambda_{ij}/dt$, and
$\dot\mu_{ij} = d\mu_{ij}/dt$. The relations $\dot\theta_{ij} = f_{ij}\theta_{ij}$ and $\dot\phi_{ij} = g_{ij}\phi_{ij}$ are also
useful. Note that with each $\phi_{ij}$ we may associate the four-tuple
$(\phi_{ij}, \theta_{ij}, \lambda_{ij}, \mu_{ij})$, so that the four-tuple may be regarded as a random
variable with respect to $\phi_{ij}$.

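With the above conventions, the change in mean fitness in the
host population becomes

$$d\lambda_{..}/dt = \operatorname{var}(\lambda; V) + \operatorname{cov}(\lambda, \mu; R) + E(f\lambda) + E(\dot\lambda), \tag{4.1}$$

and the change in mean fitness in the pathogen population becomes

$$d\mu_{..}/dt = \operatorname{var}(\mu; R) + \operatorname{cov}(\lambda, \mu; V) + E(f\mu) + E(\dot\mu). \tag{4.2}$$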


Thus, the change in the mean fitness in time in the host population
partitions into a variance component in the host population attributable
to varieties, a component of covariance attributable to races, a term
attributable to the changes in the measures of departure from random
association, and a term attributable to changes in the fitness functions
in time. The equation characterizing the change of mean fitness in
time in the pathogen population is similar. If all $\theta_{ij}$, $\lambda_{ij}$, and $\mu_{ij}$
are constant, then equations (4.1) and (4.2) reduce to the results given
in the previous paper.
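
For the constant case just mentioned, the reduced form of (4.1) can be checked against a finite-difference derivative. The Python sketch below is a hedged illustration, not the author's computation: it assumes random association (all $\theta_{ij} = 1$), constant fitnesses, the marginal fitnesses $\lambda_{i.} = \sum_j y_j \lambda_{ij}$ and $\mu_{.j} = \sum_i x_i \mu_{ij}$, and the component definitions $\operatorname{var}(\lambda; V) = \sum_i x_i(\lambda_{i.} - \lambda_{..})^2$ and $\operatorname{cov}(\lambda, \mu; R) = \sum_j y_j(\lambda_{.j} - \lambda_{..})(\mu_{.j} - \mu_{..})$. All names and numbers are hypothetical.

```python
def row_means(y, f):
    """f_i. = sum_j y_j f_ij for each row i."""
    return [sum(yj * fij for yj, fij in zip(y, row)) for row in f]


def col_means(x, f):
    """f_.j = sum_i x_i f_ij for each column j."""
    return [sum(x[i] * f[i][j] for i in range(len(x))) for j in range(len(f[0]))]


def grand_mean(x, y, f):
    """f_.. = sum_i sum_j x_i y_j f_ij."""
    return sum(xi * ri for xi, ri in zip(x, row_means(y, f)))


def rhs_4_1(x, y, lam, mu):
    """var(lam; V) + cov(lam, mu; R), the right side of (4.1) in the constant case."""
    lam_bar, mu_bar = grand_mean(x, y, lam), grand_mean(x, y, mu)
    var_v = sum(xi * (li - lam_bar) ** 2 for xi, li in zip(x, row_means(y, lam)))
    cov_r = sum(yj * (lj - lam_bar) * (mj - mu_bar)
                for yj, lj, mj in zip(y, col_means(x, lam), col_means(x, mu)))
    return var_v + cov_r


def lhs_4_1(x, y, lam, mu, dt=1e-6):
    """Finite-difference estimate of d(lam_..)/dt along a small step of (2.8)."""
    lam_bar, mu_bar = grand_mean(x, y, lam), grand_mean(x, y, mu)
    x_new = [xi + dt * xi * (li - lam_bar) for xi, li in zip(x, row_means(y, lam))]
    y_new = [yj + dt * yj * (mj - mu_bar) for yj, mj in zip(y, col_means(x, mu))]
    return (grand_mean(x_new, y_new, lam) - lam_bar) / dt


# Hypothetical constant fitness values.
lam = [[0.10, -0.20], [-0.10, 0.30]]
mu = [[0.20, -0.10], [-0.30, 0.10]]
x, y = [0.5, 0.5], [0.4, 0.6]
print(lhs_4_1(x, y, lam, mu), rhs_4_1(x, y, lam, mu))
```

The two values differ only by the discretization error of the finite difference.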

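The differential equation characterizing the change in time of the
total variance in fitness in the host population is

$$d[\operatorname{var}(\lambda)]/dt = E[g(\lambda - \lambda_{..})^2] + 2\operatorname{cov}(\lambda, \dot\lambda), \tag{4.3}$$

and the differential equation giving the change in time of the total
variance in fitness in the pathogen population is

$$d[\operatorname{var}(\mu)]/dt = E[g(\mu - \mu_{..})^2] + 2\operatorname{cov}(\mu, \dot\mu). \tag{4.4}$$

Finally, the differential equation characterizing the change in the
total covariance in fitness in time is

$$d[\operatorname{cov}(\lambda, \mu)]/dt = E[g(\lambda - \lambda_{..})(\mu - \mu_{..})] + \operatorname{cov}(\lambda, \dot\mu) + \operatorname{cov}(\dot\lambda, \mu). \tag{4.5}$$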

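It will be noted that in equations (4.1) through (4.5) we have set

$$E(f\lambda) = \sum_i \sum_j \phi_{ij} f_{ij} \lambda_{ij}, \qquad E(\dot\lambda) = \sum_i \sum_j \phi_{ij} \dot\lambda_{ij}, \qquad E[g(\lambda - \lambda_{..})^2] = \sum_i \sum_j \phi_{ij} g_{ij} (\lambda_{ij} - \lambda_{..})^2. \tag{4.6}$$

$E(f\mu)$, $E(\dot\mu)$, $E[g(\mu - \mu_{..})^2]$, and $E[g(\lambda - \lambda_{..})(\mu - \mu_{..})]$ are of course
defined in a similar way. It is instructive to study the form of equations
(4.3), (4.4), and (4.5) according as all $\theta_{ij}$, $\lambda_{ij}$, and $\mu_{ij}$ are constant or
nonconstant.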


5. STATIONARY STATE SYSTEMS 


We shall say a host-pathogen system is in a stationary state if all
$\phi_{ij}$ cease to change with time. Our discussion of stationary state
systems begins by considering the case when all $\theta_{ij}$, $\lambda_{ij}$, and $\mu_{ij}$ are
constants. We note from equation (1.9) that, if all $\theta_{ij}$ are constant
and the system is in a nonstationary state, then the $x_i$ and $y_j$ must
change in such a way that the equation

$$\sum_i \sum_j \theta_{ij} x_i y_j = 1 \tag{5.1}$$

is satisfied.


Let $A = [a_{ij}] = [\theta_{ij}\lambda_{ij}]$ and $B = [b_{ij}] = [\theta_{ij}\mu_{ij}]$ be constant $m \times n$
matrices, and let $\hat x' = (\hat x_1, \cdots, \hat x_m)$ and $\hat y' = (\hat y_1, \cdots, \hat y_n)$ be vectors
of stationary state probabilities. In addition, let $\hat\lambda_{..}$ and $\hat\mu_{..}$ be the
values of $\lambda_{..}$ and $\mu_{..}$ corresponding to the stationary state vectors
$\hat x$ and $\hat y$. Finally, let $c_m$ and $c_n$ represent $m \times 1$ and $n \times 1$ column vectors
consisting of all $\hat\lambda_{..}$'s and $\hat\mu_{..}$'s respectively, i.e. $c_m' = (\hat\lambda_{..}, \cdots, \hat\lambda_{..})$
and $c_n' = (\hat\mu_{..}, \cdots, \hat\mu_{..})$. Throughout the remainder of the paper a
prime will be used to denote the transpose of a matrix or vector.

From the definition of a stationary state system, it follows that
four classes of stationary states may exist; namely, (1) when population
number in both the host and pathogen populations is constant, (2)
when population number in the host population is constant but that
in the pathogen population is variable, (3) when population number
in the host population is variable but that in the pathogen population
is constant, and (4) when population number in both the host and
pathogen populations is variable. Henceforth we shall refer to the
four classes of stationary state systems as systems of Class I, II, III,
and IV, respectively.

If a host-pathogen system belongs to Class I, for example, then the
defining equations of the class are

$$d(\log H)/dt = \hat\lambda_{..} = 0, \qquad d(\log P)/dt = \hat\mu_{..} = 0. \tag{5.2}$$


Moreover, if the system is in a stationary state, then the set of
differential equations

$$dx_i/dt = \hat x_i(\hat\lambda_{i.} - \hat\lambda_{..}) = 0 \quad (i = 1, \cdots, m),$$
$$dy_j/dt = \hat y_j(\hat\mu_{.j} - \hat\mu_{..}) = 0 \quad (j = 1, \cdots, n), \tag{5.3}$$

must be satisfied. We shall refer to equations of the form (5.3) as
equations of a stationary state. And, if the stationary state is
nontrivial, i.e. all $\hat x_i$ and $\hat y_j$ are not zero, then the equations

$$\hat\lambda_{i.} = 0 \quad (i = 1, \cdots, m), \qquad \hat\mu_{.j} = 0 \quad (j = 1, \cdots, n), \tag{5.4}$$

must be satisfied.

By writing equations (5.4) out in full, we see that the stationary
state vectors $\hat x$ and $\hat y$ must satisfy the algebraic equations

$$B'\hat x = 0_n, \qquad A\hat y = 0_m, \tag{5.5}$$

where $0_n$ and $0_m$ are $n \times 1$ and $m \times 1$ zero vectors respectively. By
continuing in this way, we may find the defining equations, the
equations of a stationary state, and a set of equivalent algebraic equations of
the three remaining stationary state systems. The results are given
in Table 1.

For the case of the matrices A and B nonconstant, we may again 
write down the defining equations of a class, the equations of a stationary 
state, and the algebraic equations of a stationary state. We shall, 
however, restrict our considerations to that class of fitness functions 
which are such that the algebraic equations of a stationary state admit 
at least one nontrivial solution. 


6. SOLUTIONS OF THE STATIONARY STATE EQUATIONS 


In this section we shall state some conditions under which solutions 
of the stationary state equations exist for the case A and B are constant 
matrices. For a detailed treatment of the methods used in this section 
the reader is referred to a book on matrix algebra such as Perlis [1952]. 
We shall say a stationary state is unique if, and only if, the equations 
of a stationary state admit a unique solution. The four classes of 
systems will be considered in order. 

The algebraic equations for a stationary state system of Class I are

$$B'\hat x = 0_n, \tag{6.1}$$
$$A\hat y = 0_m. \tag{6.2}$$

Note that (6.1) is a set of n linear equations in m unknowns and
(6.2) is a set of m linear equations in n unknowns. Let min (m, n) be
the minimum of m and n. If the rank of B is $r_1$ and that of A is $r_2$,
then neither $r_1$ nor $r_2$ can exceed min (m, n). Moreover, equations
(6.1) and (6.2) admit a nontrivial solution if, and only if, the relations
$r_1 < m$ and $r_2 < n$ are satisfied.

Let $S_1$ be the set of all $\hat x$ satisfying equation (6.1) and let $S_2$ be the
set of all $\hat y$ satisfying equation (6.2). The sets $S_1$ and $S_2$ are vector
spaces. Thus, any multiple of a vector or linear combination of vectors
belonging to the set is again a member of the set.

If the rank of B is $r_1 < m$, then equation (6.1) may be reduced by
elementary row operations to the form

$$\begin{bmatrix} (I_{r_1} : B_1) \\ 0_1 \end{bmatrix} \hat x = 0_n, \tag{6.3}$$

where $I_{r_1}$ is an identity matrix of order $r_1$, $B_1$ is an $r_1 \times (m - r_1)$ matrix
of constants, and $0_1$ is an $(n - r_1) \times m$ zero matrix. The row vectors of
the $(m - r_1) \times m$ matrix

$$(B_1' : -I_{m-r_1}) \tag{6.4}$$

are solutions of equation (6.1) and form a basis for the vector space $S_1$.

[TABLE 1. STATIONARY STATE SYSTEMS: the defining equations, the equations of a stationary state, and the algebraic equations of a stationary state for systems of Class I, II, III, and IV.]

BIOMETRICS, SEPTEMBER 1961


Similarly, if the rank of A is $r_2 < n$, then (6.2) may be reduced to
the form

$$\begin{bmatrix} (I_{r_2} : A_1) \\ 0_2 \end{bmatrix} \hat y = 0_m, \tag{6.5}$$

where $I_{r_2}$ is an identity matrix of order $r_2$, $A_1$ is an $r_2 \times (n - r_2)$ matrix
of constants, and $0_2$ is an $(m - r_2) \times n$ zero matrix. The row vectors
of the $(n - r_2) \times n$ matrix

$$(A_1' : -I_{n-r_2}) \tag{6.6}$$

are solutions of equations (6.2) and form a basis for the vector space $S_2$.
Any pair of vectors $(\hat x, \hat y)$, where $\hat x$ belongs to $S_1$ and $\hat y$ belongs to $S_2$,
with coordinates satisfying the conditions $0 \le \hat x_i \le 1$, $0 \le \hat y_j \le 1$,
$\sum_i \hat x_i = 1$, and $\sum_j \hat y_j = 1$, is a solution of equations (6.1) and (6.2),
and its members are, therefore, probability vectors of a stationary state. Clearly,
the pair $(\hat x, \hat y)$ is not unique so that there exists no unique stationary
state for systems of Class I.
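
The row-reduction argument can be illustrated with a short Python sketch. It is only a sketch: the function names and the example matrix are hypothetical, exact rational arithmetic is used for clarity, and the basis vectors are returned with the opposite sign to the rows of (6.6), which spans the same vector space. A basis vector is rescaled to a probability vector when its components share a sign.

```python
from fractions import Fraction


def null_space_basis(rows):
    """Basis of {v : M v = 0}, found by reducing M to reduced row-echelon form."""
    M = [[Fraction(v) for v in r] for r in rows]
    nrows, ncols = len(M), len(M[0])
    pivots = []
    r = 0
    for c in range(ncols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((k for k in range(r, nrows) if M[k][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        pivot = M[r][c]
        M[r] = [v / pivot for v in M[r]]
        for k in range(nrows):
            if k != r and M[k][c] != 0:
                factor = M[k][c]
                M[k] = [a - factor * b for a, b in zip(M[k], M[r])]
        pivots.append(c)
        r += 1
        if r == nrows:
            break
    free = [c for c in range(ncols) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -M[row_index][f]
        basis.append(v)
    return basis


def to_probability(v):
    """Rescale v so its components sum to one; None if that gives a negative entry."""
    total = sum(v)
    if total == 0:
        return None
    p = [c / total for c in v]
    return p if all(c >= 0 for c in p) else None


# Hypothetical 2 x 2 matrix A of rank 1, so that A y = 0 has a nontrivial solution.
A = [[1, -1], [-2, 2]]
basis = null_space_basis(A)
y_hat = to_probability(basis[0])
print(basis, y_hat)
```

When the null space has dimension greater than one, every admissible convex combination of such probability vectors is again a solution, which is the non-uniqueness noted above.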


For systems of Class II the algebraic equations of a stationary
state are

$$B'\hat x = c_n, \tag{6.7}$$
$$A\hat y = 0_m. \tag{6.8}$$

Equation (6.8) may be solved by the methods discussed in Class I
systems but (6.7) needs special consideration. Writing (6.7)
componentwise we have

$$\sum_i b_{ij}\hat x_i = \sum_i \hat\mu_{i.}\hat x_i \quad (j = 1, \cdots, n), \tag{6.9}$$

where

$$\hat\mu_{i.} = \sum_j b_{ij}\hat y_j.$$

For each probability vector $\hat y$ which is a solution of (6.8), we may
find a $\hat\mu_{i.}$ which is independent of $\hat x$, and for each $\hat\mu_{i.}$ corresponding
to a $\hat y$ we have the system of homogeneous equations

$$\sum_i (b_{ij} - \hat\mu_{i.})\hat x_i = 0 \quad (j = 1, \cdots, n). \tag{6.10}$$

It is clear that the methods discussed in Class I systems may again
be applied to find solutions of equations (6.10). We note that there
exists no unique stationary state for Class II systems, since there is no
unique solution of equations (6.7) and (6.8). It is clear that the methods
used to find stationary state solutions for Class II systems may again
be applied to Class III systems.

For systems of Class IV the algebraic equations of a stationary
state are

$$B'\hat x = c_n, \tag{6.11}$$
$$A\hat y = c_m. \tag{6.12}$$

Equations (6.11) and (6.12) cannot be solved by the methods used
heretofore since they are nonlinear in the components of the unknown
vectors $\hat x$ and $\hat y$. If at least one of the vectors $c_m$ and $c_n$ is constant,
then we may use the methods of solving linear equations to find
solutions of (6.11) and (6.12).

For example, if the vector $c_n$ is constant, then equation (6.11) admits
a solution if, and only if, the rank $r_1$ of the matrix $B'$ satisfies the relation
$r_1 \le m$ and the rank of the augmented matrix $(B', c_n)$ is $r_1$. If $r_1 < m$,
then (6.11) admits infinitely many solutions. If $r_1 = m$, then (6.11)
admits a unique solution which may be found by Cramer's rule. In
all cases we may place conditions on the elements of the matrix $B'$
so that at least one probability vector is a solution of (6.11).

For each probability vector $\hat x$ which is a solution of (6.11) it may
be shown by the methods used in the discussion of Class II systems
that (6.12) becomes a homogeneous equation in $\hat y$. In general,
homogeneous equations do not admit unique nontrivial solutions so that a
unique stationary state cannot exist in this case. The cases in which
$c_m$ is constant and both $c_m$ and $c_n$ are constant may be treated similarly.


It is of interest to note that if (1) the vectors $\hat x_1, \cdots, \hat x_k$ are
solutions of (6.11), (2) the vectors $\hat y_1, \cdots, \hat y_k$ are solutions of (6.12),
and (3) $a_i$ $(i = 1, \cdots, k)$ are positive numbers which satisfy the
condition $a_1 + \cdots + a_k = 1$, then $a_1\hat x_1 + \cdots + a_k\hat x_k$ and $a_1\hat y_1 + \cdots + a_k\hat y_k$
are also solutions of (6.11) and (6.12) respectively.

The case in which m = n, A and B are nonsingular, and the host
and pathogen are associated at random is covered in the following
interesting theorem:


Theorem 6.1: If (1) m = n, (2) the matrices A and B are
nonsingular, and (3) all $\theta_{ij} = 1$, then $\hat x$, $\hat y$, $\hat\lambda_{..}$, and $\hat\mu_{..}$ are unique so that
a unique stationary state may exist in systems of Class IV.

Proof: Let $\hat\lambda_{..}$ and $\hat\mu_{..}$ be the values of $\lambda_{..}$ and $\mu_{..}$ corresponding to
the stationary state vectors $\hat x$ and $\hat y$. If $\hat x$ and $\hat y$ are stationary
state vectors, then their components, by Cramer's rule, must
satisfy the equations

$$\hat x_i = \hat\mu_{..}\,|B_i|/|B| \quad (i = 1, \cdots, m),$$
$$\hat y_j = \hat\lambda_{..}\,|A_j|/|A| \quad (j = 1, \cdots, m), \tag{6.13}$$

where $B_i$ is a matrix obtained from B by replacing the i-th row by a
row of ones, $A_j$ is a matrix obtained from A by replacing the j-th column
by a column of ones, and $|B_i|$, $|A_j|$, $|B|$, and $|A|$ are the determinants
of the matrices in question. Summing equations (6.13) over i and j
and using the condition that the components of $\hat x$ and $\hat y$ sum to one yields

$$\hat\mu_{..} = |B| \Big/ \sum_i |B_i|, \qquad \hat\lambda_{..} = |A| \Big/ \sum_j |A_j|. \tag{6.14}$$

By substituting $\hat\lambda_{..}$ and $\hat\mu_{..}$ in (6.13) we find

$$\hat x_i = |B_i| \Big/ \sum_i |B_i|, \qquad \hat y_j = |A_j| \Big/ \sum_j |A_j|. \tag{6.15}$$

To prove uniqueness, let $\lambda^*$ and $\mu^*$ be the values of $\lambda_{..}$ and $\mu_{..}$
corresponding to the stationary state vectors $x^*$ and $y^*$. Proceeding
as before, we have by Cramer's rule

$$x_i^* = \mu^*\,|B_i|/|B|, \qquad y_j^* = \lambda^*\,|A_j|/|A|. \tag{6.16}$$

By summing over i and j and using the condition that the components of
$x^*$ and $y^*$ sum to one we find $\hat\lambda_{..} = \lambda^*$ and $\hat\mu_{..} = \mu^*$, and by substituting
$\hat\lambda_{..}$ and $\hat\mu_{..}$ in (6.16) we reach the conclusion that $\hat x = x^*$ and $\hat y = y^*$,
which proves the theorem.
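
The determinant formulas used in the proof can be tried on a small case. The Python sketch below is illustrative only: it assumes m = n, random association, and nonsingular integer matrices A and B, and computes the stationary state vectors as ratios of determinants in which the i-th row of B, or the j-th column of A, is replaced by ones. The function names and the two matrices are hypothetical.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for c in range(len(M)):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total


def stationary_state(A, B):
    """Return (x_hat, y_hat, lambda_hat, mu_hat) for square integer matrices A and B."""
    n = len(A)
    # |B_i|: B with its i-th row replaced by a row of ones.
    B_dets = [det([[1] * n if k == i else B[k] for k in range(n)]) for i in range(n)]
    # |A_j|: A with its j-th column replaced by a column of ones.
    A_dets = [det([[1 if c == j else A[r][c] for c in range(n)] for r in range(n)])
              for j in range(n)]
    x_hat = [Fraction(d, sum(B_dets)) for d in B_dets]
    y_hat = [Fraction(d, sum(A_dets)) for d in A_dets]
    mu_hat = Fraction(det(B), sum(B_dets))
    lambda_hat = Fraction(det(A), sum(A_dets))
    return x_hat, y_hat, lambda_hat, mu_hat


# Hypothetical 2 x 2 fitness matrices.
A = [[1, 3], [2, 1]]
B = [[2, 1], [1, 3]]
x_hat, y_hat, lambda_hat, mu_hat = stationary_state(A, B)
print(x_hat, y_hat, lambda_hat, mu_hat)
```

For these matrices every row of A applied to the computed $\hat y$ gives the same value $\hat\lambda_{..}$, and every column of B weighted by the computed $\hat x$ gives the same value $\hat\mu_{..}$, as equations (6.11) and (6.12) require.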

It will be noted that in order that $\hat x$ and $\hat y$ be probability vectors
we must require that the elements of the matrices A and B be such that
the components of $\hat x$ and $\hat y$ be non-negative. It will also be noted that
by dropping the assumptions of equal numbers of varieties and races
and random association of the host and pathogen, we are led to the
possible existence of non-unique stationary states.


7. THE STABILITY OF A STATIONARY STATE


We next turn to the question of stability of a stationary state
corresponding to a pair of stationary state vectors $(\hat x, \hat y)$. For brevity,
we will simply refer to the stability of a pair $(\hat x, \hat y)$.

Let $z' = (x', y')$ be a $1 \times (m + n)$ vector and rename the components
$z_k$ $(k = 1, \cdots, m + n)$. Let dz/dt be a column vector with components
$dz_k/dt$, and let u(z) be a column vector with components
$u_k(z) = x_k(\lambda_{k.} - \lambda_{..})$ for $k = 1, \cdots, m$ and $u_k(z) = y_{k-m}(\mu_{.k-m} - \mu_{..})$ for
$k = m + 1, \cdots, m + n$. With these definitions, differential equations
(2.8) may be written in the compact form

$$dz/dt = u(z). \tag{7.1}$$


In this paper the definition of stability given by Bellman [1953,
p. 76] will be used. Using this definition, the stability of a stationary
vector $\hat z$ may be decided by using the following procedure. Let
$h' = z' - \hat z' = (h_1, \cdots, h_{m+n})$. Now by the multivariate version of
Taylor's theorem, we may express u(z) in the form

$$u(z) = u(\hat z) + Qh + v(h), \tag{7.2}$$

where Q is the $(m + n) \times (m + n)$ Jacobian matrix of u(z) evaluated
at $\hat z$ and v(h) stands for a vector of nonlinear terms. Clearly,
dh/dt = dz/dt since $\hat z$ is a constant vector. Moreover, if $\hat z$ is a stationary
state vector, then $u(\hat z)$ equals the zero vector. The question of the
stability of the vector $\hat z$ is thus reduced to the question of the stability of the
trivial solution $h' = (0, \cdots, 0)$ of the differential equation

$$dh/dt = Qh + v(h). \tag{7.3}$$


We shall obtain conditions for stability under the assumption that
differential equation (7.3) satisfies the hypotheses of Theorem 1, page
79 of Bellman [1953]. Before this theorem may be used, however,
the following observations are essential. From the definition of the
vector h, it is easy to see that its components satisfy the conditions

$$\sum_{k=1}^{m} h_k = 0 \quad \text{and} \quad \sum_{k=m+1}^{m+n} h_k = 0. \tag{7.4}$$

Therefore, any meaningful solution of differential equation (7.3) must
also satisfy conditions (7.4). It may be easily shown, although we shall
not do so here, that, if the initial conditions satisfy conditions (7.4),
then all solutions of (7.3) satisfy conditions (7.4). This result permits
us to work directly with the Jacobian matrix Q.

One of the hypotheses of the stability theorem (Theorem 1, p. 79,
Bellman) is that all solutions of the differential equation

$$dh/dt = Qh, \tag{7.5}$$

approach zero as $t \to \infty$. All solutions of (7.5) approach zero as $t \to \infty$
if, and only if, the real parts of the characteristic roots of Q are
negative (Theorem 7, p. 25, Bellman). The stability of a vector $\hat z$ is,
therefore, determined by the properties of the matrix Q.

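Let us next examine the structure of the matrix Q. The matrix Q
has the following form

$$Q = \begin{bmatrix} Q_1 & Q_2 \\ Q_3 & Q_4 \end{bmatrix}, \tag{7.6}$$

and the submatrices $Q_1$, $Q_2$, $Q_3$, and $Q_4$ have the following forms:
$Q_1$ is an $m \times m$ matrix with diagonal elements

$$q_{ii} = \hat x_i\left(\widehat{\frac{\partial\lambda_{i.}}{\partial x_i}} - \widehat{\frac{\partial\lambda_{..}}{\partial x_i}}\right) \quad (i = 1, \cdots, m), \tag{7.7}$$

and nondiagonal elements

$$q_{ii'} = \hat x_i\left(\widehat{\frac{\partial\lambda_{i.}}{\partial x_{i'}}} - \widehat{\frac{\partial\lambda_{..}}{\partial x_{i'}}}\right) \quad (i \ne i'). \tag{7.8}$$

$Q_2$ is an $m \times n$ matrix whose ij-th element is

$$q_{ij} = \hat x_i\left(\widehat{\frac{\partial\lambda_{i.}}{\partial y_j}} - \widehat{\frac{\partial\lambda_{..}}{\partial y_j}}\right) \quad (i = 1, \cdots, m;\ j = 1, \cdots, n). \tag{7.9}$$

$Q_3$ is an $n \times m$ matrix whose ji-th element is

$$q_{ji} = \hat y_j\left(\widehat{\frac{\partial\mu_{.j}}{\partial x_i}} - \widehat{\frac{\partial\mu_{..}}{\partial x_i}}\right) \quad (j = 1, \cdots, n;\ i = 1, \cdots, m). \tag{7.10}$$

$Q_4$ is an $n \times n$ matrix with diagonal elements

$$q_{jj} = \hat y_j\left(\widehat{\frac{\partial\mu_{.j}}{\partial y_j}} - \widehat{\frac{\partial\mu_{..}}{\partial y_j}}\right) \quad (j = 1, \cdots, n), \tag{7.11}$$

and nondiagonal elements

$$q_{jj'} = \hat y_j\left(\widehat{\frac{\partial\mu_{.j}}{\partial y_{j'}}} - \widehat{\frac{\partial\mu_{..}}{\partial y_{j'}}}\right) \quad (j \ne j'). \tag{7.12}$$

Recall that in the above expressions the carats stand for the
evaluation of the elements of the Jacobian matrix of u(z) at a stationary state
vector $\hat z' = (\hat x', \hat y')$.

The following equations are essential for the determination of the
properties of the matrix Q.

[Equations (7.13) and (7.14), giving the partial derivatives of $\lambda_{i.}$ and $\lambda_{..}$, are not legible in the source.]

Expressions for $\partial\mu_{.j}/\partial x_i$, $\partial\mu_{..}/\partial x_i$, $\partial\mu_{.j}/\partial y_j$, and $\partial\mu_{..}/\partial y_j$ are
similar to (7.13) and (7.14) and may be written down from symmetry
considerations.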

It is of interest to examine stability in the following five cases.
The truth of the statements to follow can easily be deduced from
equations (7.7) through (7.14).

1. If a system belongs to Class I and all $\lambda_{ij}$ and $\mu_{ij}$ are constant,
then the Jacobian matrix Q has the form

$$Q = \begin{bmatrix} 0 & Q_2 \\ Q_3 & 0 \end{bmatrix}, \tag{7.15}$$

where 0 stands for a square zero matrix. Now from the theory of
characteristic roots it is known that the sum of the elements on the
principal diagonal equals the sum of the characteristic roots. In
this case the sum of the elements on the principal diagonal is zero.
Hence, the sum of the characteristic roots is zero. It follows that all real
parts of the characteristic roots cannot have the same sign and,
therefore, cannot all be negative. Thus, no stable stationary states can exist
in this case. Note the role constant population numbers in the host
and pathogen populations and constant fitness functions play in the
instability of this case.

2. If a system belongs to Class II and all $\lambda_{ij}$ and $\mu_{ij}$ are constant,
then the matrix Q has the form

$$Q = \begin{bmatrix} 0 & Q_2 \\ Q_3 & Q_4 \end{bmatrix}. \tag{7.16}$$

3. If a system belongs to Class III and all $\lambda_{ij}$ and $\mu_{ij}$ are constant,
then the matrix Q has the form

$$Q = \begin{bmatrix} Q_1 & Q_2 \\ Q_3 & 0 \end{bmatrix}. \tag{7.17}$$

4. If the system belongs to Class IV and all $\lambda_{ij}$ and $\mu_{ij}$ are constant,
then the matrix Q will have no submatrices which are necessarily a
zero matrix.

5. If all $\lambda_{ij}$ and $\mu_{ij}$ are nonconstant, then the matrix Q will not
have a submatrix which is necessarily a zero matrix. This statement
is true of systems of Class I, II, III and IV.

The following theorem supplies a sufficient condition for stability
in cases 2, 3, 4 and 5.


Theorem 7.1: Construct a matrix Q* from Q as follows. Let
$q^*_{kk} = q_{kk}$ $(k = 1, \cdots, m + n)$ and set $q^*_{kl} = \tfrac{1}{2}(q_{kl} + q_{lk})$ for $k \ne l$.
If Q* is negative definite, then the real parts of the characteristic
roots of Q are negative.


Proof: Let $\lambda$ be a complex characteristic root of Q and let $\beta_1 + i\beta_2$
$(i^2 = -1)$ be its associated complex characteristic vector of
dimensions $(m + n) \times 1$. From the definition of characteristic
roots and vectors we have

$$Q(\beta_1 + i\beta_2) = (\beta_1 + i\beta_2)\lambda. \tag{7.18}$$

Now multiply equation (7.18) by $(\beta_1 - i\beta_2)'$, the transpose of the
complex conjugate of $\beta_1 + i\beta_2$. The result is

$$\beta_1'Q\beta_1 + \beta_2'Q\beta_2 + i(\beta_1'Q\beta_2 - \beta_2'Q\beta_1) = (\beta_1'\beta_1 + \beta_2'\beta_2)\lambda. \tag{7.19}$$

Thus since $\beta_1'\beta_1 + \beta_2'\beta_2$ is positive the sign of the real part of a
characteristic root depends on the sign of $\beta_1'Q\beta_1 + \beta_2'Q\beta_2$. But
$\beta_1'Q\beta_1 + \beta_2'Q\beta_2 = \beta_1'Q^*\beta_1 + \beta_2'Q^*\beta_2$. It follows that if the matrix Q* is negative definite,
then the real parts of all characteristic roots will be negative, which
completes the proof of the theorem.

In cases 2, 3, 4 and 5 it is possible to construct a matrix Q* from Q 
so that Q* is negative definite. Therefore, stable stationary states may 
exist in cases 2, 3, 4 and 5. 
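
The content of Theorem 7.1, and the trace argument used in case 1, can be illustrated for a $2 \times 2$ matrix with the Python sketch below. The matrices are purely hypothetical and are not derived from any host-pathogen system; negative definiteness of the symmetric matrix Q* is tested by the leading principal minors, and the characteristic roots are found from the quadratic formula.

```python
import cmath


def symmetric_part(Q):
    """Q* of Theorem 7.1: same diagonal as Q, off-diagonal entries averaged."""
    n = len(Q)
    return [[(Q[k][l] + Q[l][k]) / 2 for l in range(n)] for k in range(n)]


def is_negative_definite_2x2(S):
    """Leading-minor test for a symmetric 2 x 2 matrix: s11 < 0 and det(S) > 0."""
    return S[0][0] < 0 and S[0][0] * S[1][1] - S[0][1] * S[1][0] > 0


def eigenvalues_2x2(Q):
    """Characteristic roots of a 2 x 2 matrix from its trace and determinant."""
    trace = Q[0][0] + Q[1][1]
    determinant = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    return (trace + disc) / 2, (trace - disc) / 2


# Hypothetical matrix whose symmetric part is negative definite.
Q = [[-1.0, 2.0], [-3.0, -2.0]]
print(is_negative_definite_2x2(symmetric_part(Q)), eigenvalues_2x2(Q))

# Hypothetical matrix of the form (7.15): zero diagonal blocks, hence zero trace.
Q_class_one = [[0.0, 1.0], [-1.0, 0.0]]
print(eigenvalues_2x2(Q_class_one))
```

For the first matrix both characteristic roots have negative real parts, as the theorem asserts; for the second the real parts sum to zero, so they cannot all be negative.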


8. INTERPRETATIONS AND PRACTICAL CONSIDERATIONS. 


A little consideration will lead to the conclusion that the class of
stationary state system to which a given host-pathogen system may
belong depends on the length of the time interval and the size of the
geographical area under consideration. For example, suppose our
host-pathogen system is a given cereal crop and some species of rust.
If we consider this system with respect to the time interval $0 \le t \le t_1$,
representing a single growing season, and a single field of the crop,
then any stationary state system would probably belong to Class II,
since in a given growing season the number of members of the host
population is essentially constant but the number of members of the
pathogen population would probably be increasing. On the other
hand, if the time interval is taken to be some period of years and a
larger geographical area is considered, then our host-pathogen
system would probably belong to Class IV, since during this period
of years the number of members of both the host and pathogen
populations would probably be changing. It seems likely that the majority
of host-pathogen systems encountered in practice belong to systems
of Class IV.

A question that arises is what procedure one should follow if he
wishes to construct a host-pathogen system so that (1) it is in a stable
stationary state, and (2) the number of members in the pathogen
population at the end of any growing season is less than some critical number
c. In general, the parameters of the system can never be known exactly.
At the present time, therefore, the most fruitful approach seems to 
be an experimental one. That is, simply construct a set of systems 
consisting of a mixture of varieties and races and observe their be- 
havior over a period of time. Any system satisfying the above condi- 
tions in this time interval would, apparently, be satisfactory from the 
practical point of view. 

Some suggestions for the construction of host-pathogen systems
meeting conditions (1) and (2) of the above paragraph for the case of
equal numbers of varieties and races, constant fitness functions, and
random association of the host and pathogen are given in the previous
paper, Mode [1960].


9. SUMMARY 


The results of this paper represent a generalization of results given 
in a previous paper. In this paper a characterization of a host-pathogen 
system containing an arbitrary number of host varieties and races of 
the pathogen was given under the assumptions of nonrandom associa- 
tion of the host and pathogen and nonconstant fitness functions. 

Four classes of stationary state systems were defined on the basis 
of constant or nonconstant population numbers in the host and pathogen 
populations. Some methods of finding probability vectors of a station- 
ary state were given for the four classes of systems. 

The four classes of stationary state systems were also checked 
for stability. It was found that if the fitness functions are constant, 
then a stable stationary state may exist in systems of Class II, III, 
or IV but not in systems of Class I. If the fitness functions are not 
constant, then it is possible for a stable stationary state to exist in 
any of the four classes of systems. The possible existence of stable
stationary states is of considerable practical interest.


LATIN SQUARES TO BALANCE IMMEDIATE RESIDUAL,
AND OTHER ORDER, EFFECTS


R. SHEEHE AND IRWIN D. J. BROSS


Roswell Park Memorial Institute
Buffalo, New York, U.S.A.


1. INTRODUCTION 


While the Latin Square designs to be presented have a broad field
of application, the focus of our discussion will be on their use in clinical
trials. These extra-balanced designs were, in fact, developed specifically
to meet an experimental problem that arose in setting up an analgesic
trial. This specific problem will be considered briefly in order to
illustrate the scientific (as distinguished from purely mathematical) rationale
for doing balanced experiments.

The original study plan for the analgesic trial called for five agents 
to be used: a placebo, a standard drug, a new agent, a combination of 
the new and standard agents, and a course of hypnosis. Each patient
was to be his own control, but a decision had to be made about the 
duration of each treatment. If the duration of each treatment were 
as long as a week, drop-outs, with resulting incomplete sequences of 
treatments, would become a serious practical problem. In view of 
this, the principal investigator considered a shorter trial period for 
each agent, such as three days, to be preferable. He was confident 
that this period of observation would be long enough to elicit reliable 
responses and, so far as straight chemical carry-over, or residual, effects 
were concerned, analgesic effects would last only a few hours. But 
then he recalled a kind of psychological carry-over effect that he had 
noticed in previous studies. When an effective agent was given after 
a placebo or ineffective agent, it seemed that the effective agent often 
failed. A plausible hypothesis was that the patient had lost confidence 
in the analgesics and it would take some time for his confidence to be 
restored. This applied especially to double-blind studies in which 
the patient often was under the impression that he received the same 
agent all of the time. At this point in the planning, the investigator 
therefore wondered if there was some way to insure that each agent 
tested would be immediately preceded by the placebo an equal number 
of times. Also, since it was possible that several treatments would be, 
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like placebos, relatively ineffective, he wondered if there was a way to 
achieve more complete balance for immediate residual effects by insuring 
that every treatment would be immediately preceded equally often 
by every other treatment. 

The design which achieved this objective was worked out by trial 
and error, but then the study plan was modified to include two new 
treatments. This change emphasized the desirability of having a 
general procedure for constructing predecessor-balanced designs. In 
the literature, E. J. Williams [1] originally set down, in 1949, the
conditions sufficient to produce Latin Squares balanced for immediate
predecessors. Sufficient conditions for more remote predecessors,
particularly the next-to-last predecessors, were also specified. Williams
also presented analysis of variance procedures to accompany the designs. 
Raymond, et al, [2] used this type of design and analysis in 1957 for 
a study of tranquillizing drugs in psychoneurosis. In 1952, H. D. 
Patterson [3] considered the more general problem of predecessor- 
balanced designs, not only for square arrangements such as Williams’, 
but also for certain incomplete block arrangements. 

In 1958, J. V. Bradley [4] presented an easily remembered con- 
struction which meets the conditions set down by Williams when the 
number of treatments is even. He also presented additional balancing 
procedures which might be useful in special cases. To round out the 
picture, we shall present here an easily remembered procedure which 
can be used whether the number of treatments is odd or even. In 
addition to the Latin Square and immediate predecessor-balance 
features (properties 1 and 3, resp., to follow), another balance property 
(property 2, to follow) will be noted. We shall discuss the application 
of this, and other extra-balanced designs, in the context of the clinical 
trial. In the appendix, we shall present a detailed proof of the method. 
A second section of the appendix will be devoted to describing a simple 
variation in construction which produces Graeco-Latin Squares when 
the number of treatments is odd. 


2. PROCEDURE 


Before proceeding further, a more formal definition of ‘balance’ is 
appropriate. In addition, some predecessor balance properties, other 
than immediate, will be defined. A design is called balanced with 
respect to the set of immediate predecessors if every treatment is 
immediately preceded equally often by every other treatment. Simi- 
larly, a design may be called balanced with respect to the set of prede- 
cessors of any specified degree of remoteness (e.g., the second degree, 
which involves the next-to-last predecessors), if every treatment is
preceded equally often by every other treatment at that degree of
remoteness. Balance with respect to the set of all predecessors (without 
regard to degree of remoteness) is achieved if every treatment is pre- 
ceded, immediately or more remotely, equally often by every other 
treatment. The term, complete balance, is reserved for the case when 
the set of predecessors (of a specified or unspecified degree of remoteness) 
is the same for every treatment. Thus, balanced designs are not com- 
pletely balanced only because treatments do not precede themselves. 
The design presented here is (1) completely balanced with respect to 
the number of treatments preceding every treatment, (2) balanced 
with respect to the set of all predecessors, and (3) balanced with respect 
to the set of immediate predecessors. 
The procedure for construction is as follows: 


(a) Number the treatments, i = 1, ..., n.
(b) Start with a cyclic n X n Latin Square, i.e. one in which the
sequence of treatments in the i-th row is i, i + 1, ..., n, 1, 2, ..., i - 1.


(c) Interlace each row of the cyclic Latin Square with its own re- 
verse order sequence, i.e. with its mirror image. For example, if n = 5, 
the first row of the cyclic Latin Square reads 1, 2, 3, 4, 5. Its mirror 
image is 5, 4, 3, 2, 1, and when this is interlaced with the first row of 
the original square, the interlaced sequence reads 1, 5, 2, 4, 3, 3, 4, 2, 5, 1. 

(d) Slice the resulting n X 2n figure down the middle, thus forming 
two n X n Latin Squares. The columns of each square refer to the 
order of presentation, from left to right, and the rows refer to individuals. 
Treatments appear in the body of each square. 
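
Steps (a) through (d) can be sketched in a few lines of Python. This is only an illustration of the procedure as described; the function name `construct_squares` is hypothetical, and each square is represented as a list of rows, one row per individual.

```python
def construct_squares(n):
    """Return the (left, right) n x n squares produced by steps (a) to (d)."""
    left, right = [], []
    for i in range(1, n + 1):
        # Step (b): the i-th row of the cyclic square is i, i+1, ..., n, 1, ..., i-1.
        row = [(i - 1 + k) % n + 1 for k in range(n)]
        # Step (c): interlace the row with its mirror image.
        mirror = row[::-1]
        interlaced = [t for pair in zip(row, mirror) for t in pair]
        # Step (d): slice the 2n-long sequence down the middle.
        left.append(interlaced[:n])
        right.append(interlaced[n:])
    return left, right


left5, right5 = construct_squares(5)
print(left5[0], right5[0])  # first row of each square for n = 5
```

For n = 5 the first interlaced row is 1, 5, 2, 4, 3, 3, 4, 2, 5, 1, in agreement with the example in step (c).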

It will be found that, when n is even, each of the constructed squares
has the three desired properties. In this case, either of the two squares
may be used. When n is odd, each of the constructed squares has the
first property, but not the last two. However, when the two squares
are considered as a whole, the last two properties are indeed present.
Consequently, in this case, both of the constructed squares must be
used. Constructed squares for n = 4 and n = 5 are presented in
Tables 1 and 2. The reader may verify, by inspection, that the stated
properties are present in either case. It will also be noted that, for
n = 4 or for any even number, in general, the left square is identical
with that originally presented by Bradley. Proof that all these
properties hold in general will be offered in the appendix.
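
The verification by inspection can also be done mechanically. The Python sketch below is illustrative; the function names are hypothetical, and the squares are written out as literals obtained by applying steps (a) through (d) for n = 4 (left square) and n = 5 (both squares). It checks the Latin Square property, balance for immediate predecessors, and balance for the set of all predecessors.

```python
from collections import Counter


def is_latin(square):
    """Every treatment appears once in each row and once in each column."""
    symbols = set(range(1, len(square) + 1))
    return (all(set(row) == symbols for row in square)
            and all(set(col) == symbols for col in zip(*square)))


def immediate_predecessor_counts(rows):
    """Count ordered pairs (predecessor, treatment) for adjacent positions."""
    return Counter((a, b) for row in rows for a, b in zip(row, row[1:]))


def all_predecessor_counts(rows):
    """Count ordered pairs (earlier treatment, later treatment) within each row."""
    return Counter((row[p], row[q]) for row in rows
                   for q in range(len(row)) for p in range(q))


def is_balanced(counts, n):
    """True if every ordered pair of distinct treatments occurs equally often."""
    values = [counts.get((a, b), 0)
              for a in range(1, n + 1) for b in range(1, n + 1) if a != b]
    return len(set(values)) == 1 and values[0] > 0


left4 = [[1, 4, 2, 3], [2, 1, 3, 4], [3, 2, 4, 1], [4, 3, 1, 2]]
left5 = [[1, 5, 2, 4, 3], [2, 1, 3, 5, 4], [3, 2, 4, 1, 5],
         [4, 3, 5, 2, 1], [5, 4, 1, 3, 2]]
right5 = [[3, 4, 2, 5, 1], [4, 5, 3, 1, 2], [5, 1, 4, 2, 3],
          [1, 2, 5, 3, 4], [2, 3, 1, 4, 5]]

print(is_latin(left4), is_balanced(immediate_predecessor_counts(left4), 4))
print(is_balanced(immediate_predecessor_counts(left5), 5))
print(is_balanced(immediate_predecessor_counts(left5 + right5), 5))
```

Passing the rows of both n = 5 squares together corresponds to considering the two squares as a whole.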


3. DISCUSSION 


Lest the reader be left with the impression that, because the Latin 
Square principle has been used to achieve a desired order balance, a 
Latin Square analysis is advised, we specifically disclaim this as our 
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TABLE 1 
CONSTRUCTED LATIN SQUARES FOR n = 4 


Left Square Right Square 
Order of Presentation Order of Presentation 
Individual Individual 
B 2 4 2 F 
D : @. 4 2 H 2 


intention. In the context of the clinical trial where the problem of 
predecessor balance arose, the primary function of the balancing was 
not to reduce experimental error, but to provide a safeguard against a 
fairly specific danger. This safeguard is present in the designs whether 
we regard them as Latin Squares, or whether we regard individuals as 
blocks. Also, the main reason for using the patient as his own control 
in an analgesic trial is that the response variable is in subjective scale, 
dependent on the value judgment of the individual patient. Thus, if 
a fairly large number of individuals were available for the trial, a 
plausible method of analysis might be to analyze intra-patient compari- 
sons. In fact, the nature of the response variable is not very different 
from that encountered in paired comparison trials. Furthermore, 
the designs here are such that, if each treatment is paired with its 
immediate predecessor, all possible pairs of different treatments will 
be formed equally often. This suggests that an adaptation of the paired 


TABLE 2 
CONSTRUCTED LATIN SQUARES FOR n = 5 

            Left Square                            Right Square 
       Order of Presentation                  Order of Presentation 
Individual   1   2   3   4   5         Individual   1   2   3   4   5 
    A        1   5   2   4   3             F        3   4   2   5   1 
    B        2   1   3   5   4             G        4   5   3   1   2 
    C        3   2   4   1   5             H        5   1   4   2   3 
    D        4   3   5   2   1             I        1   2   5   3   4 
    E        5   4   1   3   2             J        2   3   1   4   5 

comparison analysis presented by Scheffé [5] might be satisfactory for 
practical purposes. Cochran and Cox [6] give an illustrative Latin 
Square analysis (in the manner of Williams), complicated because of 
balance for immediate residual effects, but they too caution against 
the indiscriminate use of the method when the underlying assumptions 
are suspect. 

In some clinical situations, neither a Latin Square nor a one-way 
analysis of all the data would be advisable. For example, in a post- 
operative trial of analgesics, Meier et al. [7] report that observations 
made in the first few post-operative hours, when patient pain status 
was most acute, were much more discriminatory for the efficacy of 
the agents than were observations made thereafter. If, as was actually 
done in the cited case, the later observations were dropped from the 
analysis, the remaining data would still be at least partially balanced. 
Thus, regardless of what analysis might be most appropriate, the 
predecessor balance feature serves as a kind of ‘ace-in-the-hole’. If, 
when the study is subsequently reported, the critic should raise the 
question of psychological or other carry-over effects, the investigator 
could reply by pointing to the design precaution that had been taken 
to balance out such effects. 

It is most important, however, to remember that predecessor balanc- 
ing cannot provide perfect protection against carry-over effects. Such 
effects are not likely to be consistent, that is, they may vary considerably 
from one patient to another. To control the carry-over effects, the 
balance would have to be over those patients who exhibited consistent 
carry-over effects, but unfortunately these sub-sets of patients cannot 
be distinguished in advance. Nevertheless the control should be closer 
where there is balance over a single experiment than with the usual 
Latin Square where there would be balance over a hypothetical large 
series of experiments. This more modest justification is sufficient for 
practical purposes. We should not expect any design feature to com- 
pletely solve a deep-rooted experimental problem—all that can reason- 
ably be asked is that the device improve the degree of control. This 
point deserves attention because many useful design features have 
been “over-sold” so that the investigator gets the impression that he is 
completely covered by his statistical insurance policy. 

What are some other limitations and drawbacks of these designs? 
One limitation is that the designs are not balanced with respect to 
second-degree (and higher-degree) predecessors. A more practical 
limitation (for designs with an odd number of treatments) is that the 
number of patients must be a multiple of 2n in order that balance be 
achieved. For example, with 7 treatments there might be good practical 
reasons for terminating the experiment after 35 patients or so have 
come in, but 42 patients would be required to balance the design. This 
limitation is less important, however, when the emphasis is on pre- 
cautions rather than analysis. 

Another limitation is that complete balance (as defined) with respect 
to immediate predecessors is not achieved. But by slightly modifying 
the designs given here, complete balance can be achieved by simply 
inserting a “zeroth order” column, identical with the first column, in 
each square. If the data from the zeroth order column are discarded 
in the analysis, it will be found that each treatment is immediately 
preceded equally often by every treatment, including itself. Thus the 
set of immediate predecessors is the same for every treatment. Bradley 
has already pointed this out in his discussion of an even number of 
treatments. It is equally true for an odd number of treatments. But 
the advantages of this further balance would have to be weighed against 
the possible disadvantages. For example, in the post-operative pain 
situation reported by Meier et al. and cited above, it would be especially 
undesirable to discard initial post-operative observations. 

Another simple extra-balancing feature, mentioned by Bradley for 
an even number of treatments but equally applicable for an odd number, 
provides column balance analogous to the row order balance. The 
procedure is to permute the rows of each Latin Square in such a way 
that the first column reads down in the same sequence as the first row 
reads across. Note, however, that the addition of this feature sharply 
limits the number of possible Latin Squares. From a theoretical stand- 
point (e.g. the randomization set justification of the Latin Square 
analysis) the restriction raises some difficulties. Incidentally this same 
objection applies, though usually with less force, to any other extra- 
balanced design. Sir R. A. Fisher has vigorously maintained the 
position that the analysis of data must take into account the restrictions 
of the design and this view has been accepted by a majority of statis- 
ticians. This point was debated by Student (W. G. Gosset) and Fisher 
in the 1920’s. The discussion of the Knut Vic (Knight’s Move) Latin 
Square is directly relevant here since this square contains an extra- 
balance feature. In practice this would mean either a least squares 
analysis or the segregation of individual degrees of freedom associated 
with the restriction. If the Fisherian view is accepted, then there is a 
double liability to extra balance. Not only the computational difficulty 
would be increased but also results would be strongly dependent upon 
the assumptions of the model (loss of robustness). So we would not 
advise experimenters to use extra-balanced designs simply as a gimmick 
(or on “aesthetic” grounds). There should be some sound practical 
reasons for the additional restrictions. 




APPENDIX 
Proof 


The basis for proving that the three properties listed in Section 2 
hold in general is the fact that the left square and the right square are 
mirror images of each other, i.e. treatments appear symmetrically about 
the vertical line at which the slice was made in step (d). There is this 
symmetry by virtue of the method of construction: the first treatment 
in any row of the left square appears as the last treatment in the corre- 
sponding row of the right square, the second in the left appears as the 
second-last in the right and so on to the last in the left which appears 
as the first in the right square. 

To prove that each of the constructed squares has the first property 
(complete balance with respect to number of predecessors), whether n 
is even or odd, it is sufficient to show that each square is, as claimed, 
a Latin Square. For every Latin Square has this property. By con- 
struction, each treatment appears exactly twice in every row of the 
n X 2n figure constructed in step (c). By symmetry, every treatment 
appearing in the left square also appears in the right square, hence each 
treatment must appear just once in any row of the left square and once 
in the corresponding row of the right square. Furthermore, the columns 
of the original cyclic Latin Square remain intact throughout the con- 
struction of each square. Thus each constructed square is arrived at, 
in effect, by permutation of the columns of a cyclic Latin Square. Since 
any permutation of columns of a Latin Square is also a Latin Square, 
both the left and right figures are Latin Squares. This demonstrates that 
the first property holds. 

It can now be proved that the second property (balance with re- 
spect to the set of all predecessors) holds for the two squares taken as a 
whole. As shown, the two squares are Latin Squares and are mirror 
images of each other. Consequently, if k different treatments precede 
(and therefore the remaining (n — k — 1) treatments succeed) a given 
treatment in a row of the left square then the remaining (n — k — 1) 
treatments precede (and the k succeed) the given treatment in the 
mirror image row of the right square. That is to say, in any given row 
of the two squares taken together, every treatment is preceded (and 
succeeded) just once by each of the k + (n - k - 1) = (n - 1) other 
treatments. Taking all n rows into account, every treatment is preceded 
(and succeeded) n times by each of the (n — 1) other treatments. 
This constitutes proof that the second property holds when the two 
constructed squares are taken as a whole. Proof that, when n is even, 
each constructed square has this property will be put off until the third 
property has been dealt with. 
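
The counting argument for the second property can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: the helper names are hypothetical, and "preceded" is taken to mean appearing anywhere earlier in the same row.

```python
from collections import Counter
from itertools import combinations

def constructed_squares(n):
    """Left and right squares from interlacing each cyclic row with its mirror image."""
    left, right = [], []
    for i in range(n):
        row = [(i + j) % n + 1 for j in range(n)]
        interlaced = [t for pair in zip(row, row[::-1]) for t in pair]
        left.append(interlaced[:n])
        right.append(interlaced[n:])
    return left, right

def all_predecessor_counts(rows):
    """Count ordered pairs (earlier treatment, later treatment) within each row."""
    return Counter((r[a], r[b]) for r in rows for a, b in combinations(range(len(r)), 2))

if __name__ == "__main__":
    left, right = constructed_squares(5)
    # Both squares together: each ordered pair of distinct treatments should occur n times.
    print(set(all_predecessor_counts(left + right).values()))
```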
In order to facilitate the proof of the third property, we, like Bradley, 
adopt the concept of ‘separation’. Let i be any treatment in a con- 
structed Latin Square, and let j be its immediate predecessor for a 
given individual. Then let i - j be the difference between the two 
treatments. If the difference is positive, the ‘separation’ is equal to that 
difference. If the difference is negative, the ‘separation’ is equal to n 
plus the difference. This concept of separation can be most easily 
visualized by placing the numbered treatments at equal intervals 
around a circle in clockwise sequence from 1 to n. The separation 
between i and j ≠ i is the number of intervals passed in moving clock- 
wise from j to i. Note that the separation between any i and every 
other possible preceding treatment, j ≠ i, runs from 1 to n - 1 in 
one-to-one correspondence with all treatments other than i. Thus, 
a design in which, for every treatment, the separations, 1 through 
(n — 1), appear with equal frequency, is balanced with respect to 
immediate predecessors. 

Now, the cyclic nature of columns has been maintained in the 
construction of the two squares. Consequently, the sequence of separa- 
tions between successive treatments is the same for every individual 
row in a given square. We may therefore confine our attention to the 
sequence of separations in the first row, since the same sequence will 
apply to every row. 

For n odd, treatments in the first row of the left square appear in the 
sequence 1, n, 2, n - 1, ..., (n + 3)/2, (n + 1)/2. Then the sequence 
of differences is alternating in sign: +(n - 1), -(n - 2), 
+(n - 3), ..., -(1). The sequence in the right square is reversed, 
with changed signs: +(1), -(2), +(3), ..., -(n - 1). That is to 
say, a full set of positive integers from 1 to n - 1 and a full set of nega- 
tive integers from -1 to -(n - 1) appear as differences in the first 
row of the two squares taken together. When the negative integers 
are replaced by the corresponding separations, it is seen that in the 
first (or any) row of the two squares, each separation from 1 to (n — 1) 
appears exactly twice. Since this is true for every row, and since each 
treatment appears once in each column, every separation from 1 to 
(n - 1) occurs exactly twice for each treatment. This establishes that 
the third property holds for n odd. 

For n even, the sequence of differences in the left square is +(n - 1), 
-(n - 2), +(n - 3), ..., -(2), +(1). Again, this alternating sequence 
is reversed, with changed signs, in the right square: -(1), 
+(2), ..., -(n - 3), +(n - 2), -(n - 1). Replacing negative 
differences by the corresponding separations, we get (n - 1), 2, 
(n - 3), ..., (n - 2), 1, for the left square as well as for the right square. 
This sequence contains all the odd integers in descending sequence from 
(n — 1) to 1, interlaced with all the even integers from 2 to (n — 2) 
in ascending sequence. Thus, in the first (or any) row of either square, 
each separation from 1 to (n — 1) appears exactly once. By the same 
reasoning as for the case when n is odd, this establishes that the third 
property holds for each constructed Latin Square when n is even. 
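
The separation sequences used in this argument are easy to tabulate. The following sketch (hypothetical function names, treatments numbered 1 to n) computes the separations between successive treatments in the first row of each square, following the definition of separation given above.

```python
def separation(i, j, n):
    """Separation between treatment i and its immediate predecessor j (i != j)."""
    d = i - j
    return d if d > 0 else n + d

def row_separations(row, n):
    """Separations between successive treatments along one row."""
    return [separation(row[k], row[k - 1], n) for k in range(1, len(row))]

def first_row_separations(n):
    """Separation sequences for the first row of the left and right squares."""
    row = list(range(1, n + 1))
    interlaced = [t for pair in zip(row, row[::-1]) for t in pair]
    left, right = interlaced[:n], interlaced[n:]
    return row_separations(left, n), row_separations(right, n)

if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, first_row_separations(n))
```

For even n the two sequences coincide and contain each separation once; for odd n each square contributes only half of the separations, each twice, so both squares are needed.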

It can now be shown that the second property holds for each square 
when n is even. It has already been found that the sequence of separa- 
tions in the left square is identical with that in the right when n is even. 
The beginning treatment, together with the sequence of separations, 
uniquely determines the order of treatment in any row of either square, 
by reason of one-to-one correspondence of treatments with separations. 
In the left square, there is just one row which begins with a given 
treatment, and the right square contains just one row which begins 
with the same treatment. Consequently, every row in the left square 
is identical to one and only one row in the right square. Then the 
rows of the right square could be permuted to arrive at a square which 
is identical with the left square. These two identical squares are 
balanced when taken together, if and only if each is balanced. Permuta- 
tion of rows does not disturb the balance of the right square, and since 
the two taken together have already been shown to be balanced, each 
must be balanced with respect to the set of all predecessors. 


Graeco-Latin Square Construction. 


A slight modification in the procedure for construction, when n is 
odd, produces a Graeco-Latin Square. This modification was first 
noted by our colleague, John E. Dowd. 

For n odd, proceed in the same way as steps (a), (b) and (c). 

Step (d). In the resulting n X 2n figure, replace treatment numbers 
appearing in the odd numbered columns by Latin letters and the treat- 
ments in even numbered columns by Greek letters. 

Now consider all the n² mutually exclusive pairs of Latin and Greek 
letters in adjacent columns. It will be found that the pairs appear in 
the vertical and horizontal order required to form an n X n Graeco- 
Latin Square. 
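
A short check of this modification, offered as a sketch only: the function names are hypothetical, numbers stand in for both the Latin and the Greek letters, and the pairs are formed from columns (1, 2), (3, 4), and so on of the n X 2n figure.

```python
def graeco_latin_pairs(n):
    """Pair each odd-numbered column of the interlaced figure with the next column.

    Odd-numbered columns hold the cyclic row (the 'Latin' entry) and even-numbered
    columns hold its mirror image (the 'Greek' entry).
    """
    square = []
    for i in range(n):
        row = [(i + j) % n + 1 for j in range(n)]
        square.append(list(zip(row, row[::-1])))
    return square

def is_graeco_latin(square):
    """True if both components are Latin Squares and all n*n pairs are distinct."""
    n = len(square)
    full = set(range(1, n + 1))
    for comp in (0, 1):
        part = [[cell[comp] for cell in row] for row in square]
        if not all(set(r) == full for r in part):
            return False
        if not all(set(c) == full for c in zip(*part)):
            return False
    return len({cell for row in square for cell in row}) == n * n

if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, is_graeco_latin(graeco_latin_pairs(n)))
```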

ON THE STATISTICAL THEORY OF A ROVING CREEL 
CENSUS OF FISHERMEN 


D. S. Robson 
Cornell University, Ithaca, N. Y., U.S.A. 


SUMMARY 


In order to estimate the day’s total catch from a fishery an enumer- 
ator roves through the fishing area interviewing fishermen as he en- 
counters them to determine the number n of fish caught and the time 
t expended. The interviewer is assumed to (i) start his trip at a randomly 
chosen point along a well defined route which completely covers the 
fishery, (ii) choose his initial direction at random from the two alter- 
natives, and (iii) travel at a constant rate of c circuits per day. If the 
catch rate n/t at time of interview is an unbiased estimator of a fisher- 
man’s catch rate for his completed trip and if the fishermen’s move- 
ments relative to the interviewer’s path never exceed the interviewer’s 
rate c, then rn/ct, summed over all interviews, is an unbiased estimator 
of the day’s total catch. The unit of time is one day, r is the number 
of times the fisherman was interviewed, and n/t is the catch rate at the 
r’th interview. 

Unbiasedness of n/t implies that the waiting times to first catch and 
from first to second catch are identically distributed chance variables, 
and that all waiting times between successive catches have the same 
expected value. If waiting times are independent, then unbiasedness 
implies that fishing is a Poisson process. 


INTRODUCTION 


Fishing, as every fisherman knows, is a chance process. Skill im- 
proves the chances, but a multitude of unknown factors governing fish 
behavior remain to confound even the most experienced fisherman, 
and for most of us catching a fish is still largely a matter of chance. 
One of the basic factors controlling fish catches, of course, is popula- 
tion size; in turn, however, fish population size may itself be strongly 
influenced through the efforts of the more gifted and more fortunate 
fishermen. In order to maximize the fishermen’s chances insofar as 
they are influenced by this factor, the fishery manager attempts to 
maintain the population size and composition at an optimum level 
through such practices as stocking fish and regulating the catch. These 
management decisions must be based upon objective information re- 
garding the number, size, and age composition of the fishermen’s 
catch, and one of the more important field techniques which have 
evolved for obtaining such information is the so-called creel census of 
fishermen. 

In the creel census the fishery manager or his representative makes 
direct observations on the fishing process, interviewing fishermen in 
action to determine the kinds and numbers of fish taken and the rates 
at which they are caught. Ordinarily, the fishermen so interviewed 
represent only a sample of the fishermen present, so that the creel 
census is, in fact, a sample census. Moreover, in many types of creel 
census only information on incomplete fishing trips is obtained; that 
is, the fisherman is interviewed while fishing, providing information 
on his fishing trip up to the time of interview, and his fortunes after 
the interview remain unknown to the fishery manager. Such is the 
case in the type known as the roving creel census, in which the enumer- 
ator moves through the fishing area interviewing fishermen as he en- 
counters them, and it is this commonly employed method of sampling 
and its associated methods of estimation which shall be examined in 
some detail here. 

Estimation of the total day’s catch from the fishery on the basis 
of the roving interviewer’s data appears to present a unique combina- 
tion of statistical problems in the theory of sampling and estimation. 
Some distinctive features of the roving creel census are (i) the open 
end to the sample—the number of interviews in the sample is not 
predetermined but depends, rather, upon the number and distribution 
of fishermen present, (ii) the sample of fishermen obtained by following 
some rational route through the fishery constitutes a systematic rather 
than a random sample, (iii) the probability of interviewing any given 
fisherman depends in some manner upon how long he fishes, and (iv) 
only incomplete information is obtained for any one fisherman. 

In examining this problem we shall first specify a well defined, 
roving sampling procedure and then treat the estimation problem under 
the simplifying assumption that catch rate at the time of interview is 
an unbiased estimator of that fisherman’s catch rate for his completed 
trip. Later, we consider the implications of this assumption as it relates 
to the nature of the fishing process. 


THE SAMPLING AND ESTIMATION PROCEDURE 


A specific description of the procedure followed by a roving inter- 
viewer probably would not apply in all detail to any single creel census 
ever conducted, since each fishery presents circumstances peculiar 
unto itself. The sampling process to be described here is necessarily 
a specific idealization of the general plan of roving through the fishery, 
interviewing fishermen as they are encountered; as always, the idealized 
plan is not actually attainable in practice, but can be approximated to 
a reasonable degree. 

We shall assume that some systematic route which gives complete 
coverage is plotted through the fishing area, that the interviewer 
starts his trip at the beginning of the fishing day from a randomly 
chosen point of departure along this route, that he chooses at random 
one of the two alternative directions to travel and then proceeds at a 
constant rate of travel until the end of the day. 

The line denoting the route of the interviewer effectively reduces a 
fishing area in two dimensions to a line in one dimension, and since 
the route is closed—that is, a complete coverage of the fishing area will 
bring the interviewer back to his starting point—then the line may be 
represented conceptually as a circle. A fisherman’s location then 
corresponds to a point on the circumference of the circle, determined 
by the point on the interviewer's route at which he would pass that 
fisherman's location in the fishing area. The dimension of time may 
be introduced by letting the radius of the circle represent the length 
of the fishing day. In this way a fisherman’s location in both time 
and space can be represented by a point within the circle; his location 
in the fishery determines the radius vector upon which he lies and the 
time of day determines a point on this radius vector. It is convenient 
here to regard the time axis as extending toward the center of the 
circle, so that the time is 1 (end of the day) at the center and 0 (beginning 
of the day) on the circumference. For example, if a fisherman is station- 
ary, then his entire trip can be plotted as a segment of a radius vector 
as shown in Figure 1a; the particular radius vector is determined by 

FIGURE 1a 

THE MAPPING OF A STATIONARY FISHERMAN’S TRIP WHICH STARTS AT THE 
BEGINNING OF THE DAY AND CONTINUES UNTIL TIME T AT THE LOCATION L. 


his (fixed) location and the segment extends inward from the time he 
starts fishing to the time he stops. 
The interviewer's trip, in this framework, then becomes a regular 
spiral extending from his randomly chosen starting point on the cir- 
cumference to the center point of the circle. The regularity of the 
spiral is a direct consequence of the assumption of a constant rate of 
travel. This is illustrated in Figure 1b where the interviewer's rate of 
travel is arbitrarily taken to be one complete circuit of the fishery per 
day, and the direction of travel is arbitrarily taken clockwise. 

In the particular combination of circumstances described in Figure 
1b, the interviewer’s trip does not intersect with the fisherman's trip, 

FIGURE 1b 

THE MAPPING OF AN INTERVIEWER TRIP WHICH STARTS AT THE POINT S AND 
MOVES IN A CLOCKWISE DIRECTION AT THE RATE OF ONE CIRCUIT PER DAY 


for the fisherman had already left by the time the interviewer reached 
that location. Had the interviewer, traveling in this same direction, 
chosen his starting point anywhere on the arc A shown in Figure 1c 

FIGURE 1c 

THE RANGE OF INTERVIEWER STARTING POINTS WHICH LEAD 
TO AN ENCOUNTER WITH THE FISHERMAN AT L, ARC A FOR 
CLOCKWISE TRIPS, ARC B FOR COUNTER-CLOCKWISE. 


then he would have encountered this fisherman, or traveling in the 
opposite direction and starting anywhere on arc B would have led to an 
encounter. Since the probability distribution for the starting point 
is uniform on the circumference of the circle, the probability that it 
will fall on arc A (or B) is simply the length of the arc expressed as a 
fraction of the entire circumference. Clearly, this relative length of A 
(and of B) is simply T, the length of the fisherman’s stay; for in order 
to reach the fisherman’s location L at exactly the time T when he 
stopped fishing, the interviewer would have had to start his trip at a 
point S just far enough in back of L so as to reach L by traveling for 
exactly a time T, hence covering a fraction T of the circumference. The 
probability that this fisherman would be interviewed is therefore 

\[
\begin{aligned}
P(\text{interview}) &= P(\text{clockwise travel}) \cdot P(\text{interview} \mid \text{clockwise travel}) \\
&\quad + P(\text{counterclockwise}) \cdot P(\text{interview} \mid \text{counterclockwise}) \\
&= \tfrac{1}{2}T + \tfrac{1}{2}T = T.
\end{aligned}
\]


Thus, if the interviewer's rate of travel is one circuit per day, then a 
stationary fisherman’s probability of interview is equal to the fraction 
of a day that he fishes. 

If the interviewer's rate of travel is not 1, then this result no longer 
holds. It is obvious, for example, that, if the interviewer makes 2 
complete circuits per day, then every fisherman who fishes for more 
than half a day is certain to be interviewed at least once, and may be 
contacted twice. Figure 2a, again employing a stationary fisherman, 


FIGURE 2a 

A MAPPING OF A STATIONARY FISHERMAN’S TRIP OF LENGTH T > 1/2 AND 
AN INTERVIEWER’S TRIP AT A RATE OF c = 2 COMPLETE CIRCUITS PER DAY. 
(Arc labels in the figure: 2(1 - T) = P(ONE INTERVIEW); 2T - 1 = P(TWO INTERVIEWS).) 


illustrates this situation and indicates the ranges of starting points 
which would result in 1 and 2 contacts between interviewer and fisher- 
man. In order for a single contact to occur here, the interviewer must 
pass the fisherman’s location for the second time between time T and 
time 1. Since he is traveling at the rate of 2 circuits per day, this 
means that the range of starting points which will accomplish this is of 
relative length 2(1 — T). The remaining range of starting points, of 
relative length 1 — 2(1 — T), will result in 2 contacts between inter- 
viewer and fisherman. Figure 2b illustrates the case of a fisherman 
whose trip length is less than a half day, and indicates the range of 
starting points which will produce 0 and 1 contact. In order for a 
single contact to be made the interviewer must pass the fisherman’s 
location between time 0 and time T, and the probability of this occurring 
is 2T. We have here ignored the feature of randomized direction of 

FIGURE 2b 

A MAPPING OF INTERVIEWER AND STATIONARY FISHERMAN 
FOR THE CASE c = 2 AND T < 1/c. 
(Arc labels in the figure: 1 - 2T = P(NO INTERVIEW); 2T = P(ONE INTERVIEW).) 


travel because of the trivial role it plays in the case of stationary 
fishermen. 

In the more general case where the interviewer makes c complete 
circuits of the fishery in a day then a stationary fisherman whose trip 
length is T will be interviewed either [cT] or [cT] + 1 times, where 
[cT] denotes the largest integer contained in cT. By the same argument 
employed earlier, the probability of exactly [cT] interviews is 
[cT] + 1 - cT, and the probability of [cT] + 1 interviews is then 
cT - [cT]. The expected number of interviews of this stationary 
fisherman is therefore 

\[
[cT]\,([cT] + 1 - cT) + ([cT] + 1)\,(cT - [cT]) = cT
\]


for any constant travel rate c > 0. 
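
The distribution of the number of interviews can be checked by averaging over a fine grid of starting points. This is a sketch under simplifying assumptions that are ours, not the paper's: the fisherman is stationary and fishes from time 0 to time T, the route has unit length, and u denotes the fraction of a circuit by which the fisherman lies ahead of the interviewer's starting point. The function names are hypothetical.

```python
import math
from collections import Counter

def interviews_stationary(u, c, T):
    """Number of times an interviewer at rate c passes the fisherman during [0, T].

    The interviewer reaches the fisherman at times (u + m) / c for m = 0, 1, 2, ...
    """
    return max(0, math.floor(c * T - u) + 1)

def distribution(c, T, grid=100_000):
    """Approximate distribution of the interview count for a uniform starting point."""
    counts = Counter(interviews_stationary((k + 0.5) / grid, c, T) for k in range(grid))
    return {r: m / grid for r, m in sorted(counts.items())}

def expected(c, T, grid=100_000):
    """Approximate expected number of interviews; should be close to c * T."""
    return sum(r * p for r, p in distribution(c, T, grid).items())

if __name__ == "__main__":
    print(distribution(2, 0.75))   # compare with 2(1 - T) and 2T - 1
    print(distribution(2, 0.3))    # compare with 1 - 2T and 2T
    print(expected(3, 0.45))       # compare with cT
```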

If the fishermen themselves are moving about in the fishery then 
the interviewing process becomes somewhat more complicated anal- 
ytically, and apparently unmanageable from the viewpoint of estima- 
tion. We first observe that if the fisherman’s rate of movement relative 
to the interviewer’s path never exceeds the interviewer's rate c then 
the expected number of interviews remains at cT. To demonstrate 
this we have exhibited in Figure 3a the path of a slow-moving fisherman, 


FIGURE 3a 


ILLUSTRATION OF A PATH TAKEN BY A MOVING FISHERMAN WHO COVERS A 
FRACTION A OF THE FISHERY DURING A FRACTION T OF THE DAY. 

traveling in a clockwise direction, who covers a relative distance A 
in a time T, thus moving at an average rate of s̄ = A/T. As seen in 
Figure 3b, where the interviewer's rate is taken to be c = 1, the range 


FIGURE 3b 

ILLUSTRATION SHOWING THE RANGE OF STARTING POINTS OF INTERVIEWER 
TRIPS WHICH WILL INTERSECT THE MOVING FISHERMAN IN 3a. 
(Panel labels: clockwise route; counterclockwise route.) 


of interviewer starting points which lead to an interview is of relative 
length T(1 - s̄) for clockwise interviewer trips and relative length 
T(1 + s̄) for counterclockwise trips. Thus, the probability of an interview 
is T(1 - s̄)/2 + T(1 + s̄)/2 = T. More generally, we see that 
if, for an arbitrary interviewer rate c, the fisherman’s rate s is uniformly 
less than c, then the probability distribution of the number r of interviews 
for clockwise trips is 

\[
P(r = [T(c - \bar{s})]) = [T(c - \bar{s})] + 1 - T(c - \bar{s}) = 1 - P(r = [T(c - \bar{s})] + 1),
\]

and for counterclockwise trips is 

\[
P(r = [T(c + \bar{s})]) = [T(c + \bar{s})] + 1 - T(c + \bar{s}) = 1 - P(r = [T(c + \bar{s})] + 1),
\]

and, again, the expected number of interviews is 

\[
\begin{aligned}
\mathcal{E}(r) &= \tfrac{1}{2}\,\mathcal{E}(r \mid \text{clockwise trip}) + \tfrac{1}{2}\,\mathcal{E}(r \mid \text{counterclockwise}) \\
&= \tfrac{1}{2}T(c - \bar{s}) + \tfrac{1}{2}T(c + \bar{s}) \\
&= cT.
\end{aligned}
\]
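
The cancellation of the fisherman's own movement can be illustrated numerically. The sketch below assumes, for simplicity, a fisherman moving clockwise at a constant rate s < c from time 0 to time T, so that the interviewer closes on him at rate c - s when traveling clockwise and at rate c + s when traveling counterclockwise; the function names are hypothetical.

```python
import math

def expected_for_relative_rate(rel_rate, T, grid=100_000):
    """Average interview count over uniform starting gaps u for a given closing rate."""
    total = 0
    for k in range(grid):
        u = (k + 0.5) / grid                      # initial gap as a fraction of a circuit
        total += max(0, math.floor(rel_rate * T - u) + 1)
    return total / grid

def expected_moving(c, s, T):
    """Expected interviews with direction chosen at random; requires 0 <= s < c."""
    clockwise = expected_for_relative_rate(c - s, T)
    counterclockwise = expected_for_relative_rate(c + s, T)
    return 0.5 * clockwise + 0.5 * counterclockwise

if __name__ == "__main__":
    print(expected_moving(1, 0.4, 0.5))   # compare with cT
    print(expected_moving(3, 1.0, 0.8))   # compare with cT
```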


As soon as the fisherman’s rate s exceeds the interviewer’s rate c, 
the roles of the two paths effectively become reversed in so far as they 
determine the number of interviews. In effect, the hunter becomes 
the hunted, and an interview occurs now if the fisherman overtakes 
the interviewer. Consequently, if a fisherman travels at a constant 
rate s > c for a time T then the expected number of times he will be 
interviewed during this period is sT rather than cT. If during a period 
of length T the fisherman's rate s > c is not constant but is s_1 for a 
time τ_1, then s_2 for a time τ_2, ..., then s_k for a time τ_k, 
τ_1 + ... + τ_k = T, then his expected number of interviews during 
this period is s̄T, where 

\[
\bar{s} = \frac{1}{T} \sum_{i=1}^{k} s_i \tau_i .
\]

It then follows that, if s varies continuously, s = s(τ) > c, in the interval 
0 < τ < T, the expected number of interviews is 

\[
\mathcal{E}(r) = \bar{s}T = \int_0^T s(\tau)\, d\tau .
\]


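For a completely arbitrary type of fishing trip of duration T, the fisher- 
man’s rate of movement s(τ) may exceed c part of the time, and part 
of the time not, so in general 

\[
\mathcal{E}(r) = \int_{\mathcal{T} - \mathcal{T}_c} c \, d\tau + \int_{\mathcal{T}_c} s(\tau)\, d\tau ,
\]

where 

\[
\mathcal{T} - \mathcal{T}_c = \{\tau \mid 0 < \tau < T,\ s(\tau) \le c\}, \qquad
\mathcal{T}_c = \{\tau \mid 0 < \tau < T,\ s(\tau) > c\}.
\]

Since there will be only finitely many discontinuities in s(τ), we may 
simply write 

\[
\mathcal{E}(r) = cT + (\bar{s} - c)T_c ,
\]

where T_c is the Riemann measure of $\mathcal{T}_c$, or the total length of time 
that s(τ) > c, and s̄ is the average of s(τ) over $\mathcal{T}_c$, 

\[
\bar{s} = \frac{1}{T_c} \int_{\mathcal{T}_c} s(\tau)\, d\tau .
\]

The two components of ℰ(r), cT and (s̄ - c)T_c, are not individually 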
estimable on the basis of the interviewer's data—unless, perhaps, he 
also makes some quantitative measure of the fisherman’s rate of move- 
ment when he is approached for an interview. For this reason we shall, 
from this point onward, assume that no fisherman’s rate of movement 
ever exceeds that of the interviewer (T_c = 0), and this is most easily 
accomplished by imposing the restriction that the fisherman’s time in 
motion is not counted as fishing time, nor is he interviewed while in 
motion. 
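
A small numerical illustration of the two-component form may help. The sketch assumes a piecewise-constant rate of movement, with interviews accruing at rate c while the fisherman's rate is at most c and at his own rate while it exceeds c, as argued above; the function name and the example figures are hypothetical.

```python
def expected_interviews_piecewise(c, segments):
    """Return (direct, via_formula) for segments given as (rate s_i, duration tau_i).

    direct      : sum of max(c, s_i) * tau_i over all segments
    via_formula : c*T + (s_bar - c)*T_c, with s_bar the average rate over the
                  time T_c during which the fisherman's rate exceeds c
    """
    T = sum(tau for _, tau in segments)
    direct = sum(max(c, s) * tau for s, tau in segments)
    fast = [(s, tau) for s, tau in segments if s > c]
    T_c = sum(tau for _, tau in fast)
    s_bar = sum(s * tau for s, tau in fast) / T_c if T_c > 0 else 0.0
    via_formula = c * T + (s_bar - c) * T_c
    return direct, via_formula

if __name__ == "__main__":
    # Hypothetical trip: stationary, a fast move, slow drifting, another fast move.
    print(expected_interviews_piecewise(2, [(0, 0.2), (3, 0.1), (1, 0.3), (5, 0.05)]))
```

When no segment exceeds c, T_c is zero and both values reduce to cT, which is the case assumed from this point onward.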

At this point it is worth noting that while the arguments so far 
have been directed at the case of a single interviewer traveling at the 
constant rate of c circuits per day, all arguments apply just as well to 
the case of k interviewers equally spaced along the route and traveling 
at the constant rate of c/k. The combined data of k such interviewers 
is equivalent in every way to the data of a single interviewer traveling 
k times as fast; that is, traveling at the rate c. 

The preceding results apply to each fisherman who is present in 
the fishery sometime during the course of the day; consequently, if 
there are M fishermen present with trip lengths T_1, ..., T_M, then 
the expected number of interviews is c(T_1 + ... + T_M). If c < 1, 
then this is also the expected number of different fishermen interviewed; 
otherwise, the expected number of interviews exceeds the expected 
number of different fishermen contacted by an amount c Σ* (T_i - 1/c), 
where the sum Σ* extends over all fishermen whose effort T_i exceeds 1/c. 

In his interviews the enumerator determines the number n of fish 
caught and the amount of effort (time) t expended up to the time of 
interview, thus enabling him to estimate the catch rate for each interviewed 
fisherman. Conventional methods of estimating the day’s 
total catch from the fishery by a roving creel census are based upon the 
assumption that this catch rate n/t at the time of interview is an unbiased 
estimator of the catch rate N/T for the completed trip. We 
shall examine the implications of this assumption in a later section; 
for the moment, however, we will accept it. If the fisherman is interviewed 
r times during his trip, then r such unbiased estimates of N/T 
are available; since the information is cumulative, however, the last 
interview contains all of the information of those preceding it, so the 
last catch rate would be used to estimate that fisherman’s N/T. Using 
the established fact that the expected value of r, the number of interviews, 
is cT, we shall then see that rn/ct is an unbiased estimate of N, 
and summing this over all m fishermen interviewed then gives an unbiased 
estimate of the total number of fish caught by all M fishermen 
during the day. 
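
The claim that rn/ct is unbiased for N can be checked by simulation. The following is a minimal sketch rather than the author's procedure: it assumes a single stationary fisherman, an interviewer who passes him every 1/c days with a uniformly random first pass, and a Poisson fishing process with a hypothetical rate `theta` (a process for which the catch rate at interview is unbiased). The function names and parameter values are illustrative only.

```python
import random


def roving_estimate(c, T, theta, rng):
    """One simulated value of r*n/(c*t) for a single stationary fisherman.

    c     : interviewer's travel rate (circuits per day)
    T     : length of the fishing trip (fraction of a day)
    theta : assumed Poisson catch rate (fish per day); illustrative only
    """
    # The interviewer passes the fisherman every 1/c days; the first pass
    # falls at a uniformly distributed time after the trip starts.
    pass_time = rng.uniform(0.0, 1.0 / c)
    r = 0      # number of interviews during the trip
    t = 0.0    # elapsed fishing time at the last interview
    while pass_time < T:
        r += 1
        t = pass_time
        pass_time += 1.0 / c
    if r == 0 or t <= 0.0:
        return 0.0  # never interviewed: contributes nothing to the estimate

    # Catch on hand at time t: count the exponential waiting times in (0, t].
    n = 0
    clock = rng.expovariate(theta)
    while clock <= t:
        n += 1
        clock += rng.expovariate(theta)
    return r * n / (c * t)


def mean_estimate(c, T, theta, trials=20000, seed=1):
    """Average of many simulated estimates; should be near theta * T."""
    rng = random.Random(seed)
    return sum(roving_estimate(c, T, theta, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Under these assumptions the expected trip catch is theta * T = 4.
    print(mean_estimate(c=3.0, T=0.8, theta=5.0))
```

With c = 3 and T = 0.8 the fisherman is always interviewed at least twice, so t stays well away from zero and the simulated mean settles close to the expected catch of 4.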

A simple numerical example shown in Figure 4 illustrates this point. 
Here the interviewer's rate of travel is c = 3/2 circuits per day; two 
stationary fishermen are present during the day, one starting at the 
beginning of the day and fishing a fraction $T_1 = 7/8$ of the day and the 
second, who is stationed part of a circuit behind the first on the interviewer's 
(clockwise) route, fishing a fraction $T_2 = 3/8$ of the day. 
The first fisherman will then be interviewed either once or twice, $r_1 = 1$ 
or 2, and $r_2 = 0$ or 1, giving four different possible outcomes for $r_1$ and $r_2$. 
The ranges of interviewer starting points which produce each of these 
four outcomes are shown in Figure 4 along with their relative lengths, 
or probability measures. 

FIGURE 4 

A NUMERICAL EXAMPLE WITH TWO STATIONARY FISHERMEN SHOWING THE 
PROBABILITY DISTRIBUTION OF NUMBER OF INTERVIEWS WHEN 
THE INTERVIEWER’S TRAVEL RATE IS c = 3/2. 

The expected value of $[(r_1 N_1/T_1) + (r_2 N_2/T_2)]/c$ in this example is then 

\[ \frac{2}{3}\left[\frac{21}{16}\left(\frac{N_1}{7/8}\right) + \frac{9}{16}\left(\frac{N_2}{3/8}\right)\right] = \frac{2}{3}\left[\frac{3}{2} N_1 + \frac{3}{2} N_2\right] = N_1 + N_2 . \]
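
The arithmetic of this example can be reproduced exactly. The sketch below assumes stationary fishermen and a uniformly random interviewer starting point, so that r equals [cT] or [cT] + 1 with probabilities fixed by the fractional part of cT; the catches of 10 and 4 fish used in the check are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from math import floor


def interview_distribution(c, T):
    """Distribution of r, the number of interviews of a stationary fisherman.

    With a uniformly random interviewer starting point, r equals floor(c*T)
    or floor(c*T) + 1, and P(r = floor(c*T) + 1) is the fractional part of c*T.
    """
    cT = Fraction(c) * Fraction(T)
    low = floor(cT)
    p_high = cT - low
    return {low: 1 - p_high, low + 1: p_high}


def expected_estimate(c, trips):
    """Expected value of (1/c) * sum(r_i * N_i / T_i) over the fishermen.

    trips is a list of (T_i, N_i) pairs; N_i is the completed-trip catch.
    """
    total = Fraction(0)
    for T, N in trips:
        dist = interview_distribution(c, T)
        expected_r = sum(r * p for r, p in dist.items())
        total += expected_r * Fraction(N) / Fraction(T)
    return total / Fraction(c)


if __name__ == "__main__":
    c = Fraction(3, 2)
    print(interview_distribution(c, Fraction(7, 8)))
    print(expected_estimate(c, [(Fraction(7, 8), 10), (Fraction(3, 8), 4)]))
```

Because the expected number of interviews is cT for each fisherman, the expected estimate equals the sum of the completed-trip catches whatever values are supplied for them.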


The sampling error of an estimate of this type may be regarded as a 
sum of two components, 

\[ \frac{1}{c}\sum_{(m)} \frac{rn}{t} - \sum_{(M)} N = \left[\frac{1}{c}\sum_{(m)} \frac{rN}{T} - \sum_{(M)} N\right] + \frac{1}{c}\sum_{(m)} r\left(\frac{n}{t} - \frac{N}{T}\right), \]

where $\sum_{(m)}$ is a sum over the m fishermen sampled and $\sum_{(M)}$ is a sum 
over the M fishermen present, the first due to the sampling of fishermen 
and the second due to the sampling of the fishing process of the selected 
fisherman, that is, due to the incompleteness of the fishing trip at the 
time of interview. These components are statistically uncorrelated 
because of our assumption that for any given t, 0 < t < T, the observed 
catch rate n/t is an unbiased estimator of N/T. Unfortunately, however, 
the sampling variance and its corresponding components are in 
general not estimable from the data obtained in a roving census because 
of the systematic nature of the sample and unequal, unknown probabilities 
of selection. 

We turn now to an examination of our simplifying assumption that 
catch rate at time of interview is an unbiased estimator of the catch 
rate for the completed trip. This assumption clearly relates to the 
nature of the stochastic fishing process, and in the next section we will 
study this relationship in some detail. 


IMPLICATIONS OF THE ASSUMPTION 
OF AN UNBIASED CATCH RATE 


For this discussion we again restrict our attention to a single fisherman 
on a given day and on a fishing trip of given duration T. Since 
fishing is a chance process, the number $N_T$ of fish which will be caught 
during this period is a chance variable. The basic chance variable of 
the fishing process, however, is the amount of time required to catch 
a fish, or the waiting time between successive catches. The number 
$N_T$ is simply the number of successful waiting times contained in an 
interval of length T; hence the probability distribution of $N_T$ depends 
basically upon the distribution of waiting times. 

Our fisherman is being subjected to a sequence of waiting times 
$w_1, w_2, w_3, \ldots$ between successive catches, and, since he terminates 
the process after a period of length T, then the number $N_T$ of fish caught 
will be that number K for which $w_1 + w_2 + \cdots + w_K \le T < w_1 + w_2 + \cdots + w_K + w_{K+1}$. 
Thus, if $W_K$ denotes the sum 
of the first K waiting times, then the probability distribution of $N_T$ is 

\[ P(N_T = K) = P(W_K \le T < W_{K+1}). \]

The waiting times $w_1, w_2, w_3, \ldots$ are non-negative chance variables, 
so the event $N_T = K$ is also defined by excluding the event $W_K > T$ 
from the event $W_{K+1} > T$; consequently, an equivalent expression is 

\[ P(N_T = K) = P(W_{K+1} > T) - P(W_K > T), \]

or, in the classical form for waiting time problems, 

\[ P(N_T = K) = P(W_K \le T) - P(W_{K+1} \le T). \qquad (1) \]


The catch $n_t$ on hand at the time of interview of our fisherman is 
also a chance variable whose distribution then depends on t, T and $N_T$. 
To compute this distribution we note that the joint event $n_t > k$ and 
$N_T = K$ occurs if and only if $W_{k+1} \le t$ and $W_K \le T < W_{K+1}$; thus, 
the conditional probability of $n_t > k$, given t, T and $N_T$, is 

\[ P(n_t > k \mid t, T, N_T = K) = \frac{P(W_{k+1} \le t,\ W_K \le T < W_{K+1})}{P(W_K \le T < W_{K+1})} = \frac{P(W_{k+1} \le t,\ W_K \le T) - P(W_{k+1} \le t,\ W_{K+1} \le T)}{P(W_K \le T) - P(W_{K+1} \le T)} . \]

From this we obtain the distribution of $n_t$ by subtraction, 

\[ P(n_t = k \mid t, T, N_T) = P(n_t > k - 1 \mid t, T, N_T) - P(n_t > k \mid t, T, N_T). \]

This derivation of the distribution of $N_T$ and $n_t$ implicitly assumes 
that the duration T of the fishing trip is an arbitrary constant, determined 
independently of the outcomes $w_1, w_2, \ldots$ of the waiting 
process. While this is possibly true of the fishing process, as when the 
fisherman decides in advance just how long he will fish, we must also 
acknowledge the existence of other possible stopping rules, including 
those which are a sequential function of the waiting times. When we 
impose the restriction of unbiasedness, however, we require that all 
admissible stopping rules exhibit the same dependence on waiting 
times, and, since the rule in which T is an arbitrary constant must be 
included among the possible stopping rules, then all admissible rules 
must be independent of waiting times. This is clearly seen in the case 
$N_T = 1$, where the condition of unbiasedness becomes 

\[ \frac{t}{T} = P(n_t > 0 \mid t, T, N_T = 1) = P(w_1 \le t \mid t, T, N_T = 1). \]

Here we see that the conditional distribution of $w_1$, given that fishing 
stops at time T with $N_T = 1$ fish in the creel, must be the uniform 
distribution from 0 to T, regardless of the stopping rule. Now if the 
process satisfies this condition when T is an arbitrary constant then 
every other stopping rule for which this condition is satisfied can contain 
no more information concerning $w_1$ than does the rule in which T is 
arbitrary. We shall continue to assume, therefore, that T is simply a 
preassigned constant for each fisherman. 

The preceding arguments also assume that the restriction of unbiasedness 
applies conditionally for all t, and we shall now demonstrate 
that this must, in fact, be true if we require that unbiasedness hold 
for an arbitrary travel rate c. To see this, let 

\[ h_{T,K}(t) = \mathcal{E}(n_t/t \mid t, T, N_T = K), \]

so that the condition of unbiasedness becomes 

\[ \mathcal{E}[h_{T,K}(t)] = K/T . \]

For a stationary fisherman the distribution of t, the time of the last 
interview, is uniform from T − (1/c) to T when T > (1/c). Since the 
choice of c, the interviewer’s travel rate, is arbitrary relative to the 
time T fished by this particular fisherman, then for all c > (1/T) 

\[ \frac{K}{T} = c \int_{T-(1/c)}^{T} h_{T,K}(t)\, dt, \]

or, letting 

\[ H_{T,K}(t) = \int_0^t h_{T,K}(u)\, du, \]

then 

\[ H_{T,K}(T) - H_{T,K}\left(T - \frac{1}{c}\right) = \frac{K}{T}\left(\frac{1}{c}\right). \]

Letting x = T − (1/c) we then have 

\[ H_{T,K}(x) = H_{T,K}(T) - \frac{K}{T}\,(T - x), \]

and upon differentiating with respect to x, 

\[ h_{T,K}(x) = K/T \]

for 0 < x < T. This justifies our earlier assertion that our unbiased 
condition on n/t implies that 

\[ \mathcal{E}(rn/ct) = N_T ; \]

for now we may write 

\[ \mathcal{E}(rn/ct) = \frac{1}{c}\,\mathcal{E}\{ r\,\mathcal{E}(n/t \mid r, t)\} = \frac{1}{c}\,\mathcal{E}(r)\,\frac{N_T}{T} = N_T . \]


Returning to the distribution of $n_t$, we may express the unbiasedness 
condition as 

\[ \sum_{k=0}^{K} k\,P(n_t = k \mid t, T, N_T = K) = \frac{Kt}{T}, \]

or, equivalently, 

\[ \sum_{k=0}^{K-1} P(n_t > k \mid t, T, N_T = K) = \frac{Kt}{T}. \]

In terms of the distribution of waiting times this identity then becomes 

\[ \sum_{k=0}^{K-1} [P(W_{k+1} \le t,\ W_K \le T) - P(W_{k+1} \le t,\ W_{K+1} \le T)] = \frac{Kt}{T}\,[P(W_K \le T) - P(W_{K+1} \le T)], \qquad (2) \]

which must hold for all t and T, $0 < t < T < \infty$, and for every positive 
integer K (for K = 0 the identity is trivial). 

The identity (2) imposes severe restrictions on the nature of the 
fishing process as it is described by the distribution of waiting times. 
For the special case K = 1; i.e., the fisherman's total catch is one fish, 
this identity reveals several facts. If the cumulative distribution function 
of $w_1$, the time required to catch the first fish, is denoted by $F(w_1)$, 

\[ F(x) = P(w_1 \le x), \]

and, if 

\[ G(y \mid x) = P(w_2 \le y \mid w_1 = x), \]

then 

\[ H(T) = \int_0^T G(T - x \mid x)\, dF(x) = P(W_2 \le T) \]

and (2) becomes, for K = 1, 

\[ F(t) - \int_0^t G(T - x \mid x)\, dF(x) = \frac{t}{T}\,(F(T) - H(T)). \]

Differentiating both sides with respect to t we obtain a new identity 

\[ f(t)\,[1 - G(T - t \mid t)] = \frac{1}{T}\,(F(T) - H(T)). \qquad (3) \]

Now setting t = T we find that 

\[ H(T) = F(T) - T f(T), \qquad (4) \]

so 

\[ P(N_T = 1) = T f(T). \qquad (5) \]

Differentiating (4) with respect to T gives 

\[ h(T) = -T f'(T); \qquad (6) \]

that is, the density function of the sum of the first two waiting times 
at the point T must be equal to T times the negative derivative of f 
at T. This, in turn, implies that the density function f of the first waiting 
time is monotonically decreasing; that is, if $0 < w < w'$ then $f(w) \ge f(w')$. 
Combining (3) and (4) we have 

\[ f(t)\,[1 - G(T - t \mid t)] = f(T), \qquad (7) \]

and now differentiating with respect to T we find 

\[ f(t)\, g(T - t \mid t) = -f'(T), \]

or 

\[ f(w_1)\, g(w_2 \mid w_1) = -f'(w_1 + w_2). \qquad (8) \]


Thus, the joint distribution of the first and second waiting times, $w_1$ 
and $w_2$, must be equal to the negative derivative of the density function 
f, evaluated at the point $w_1 + w_2$. Furthermore, upon integrating 
$w_1$ out of this joint density function we find the marginal density function 
of $w_2$ to be 

\[ \int_0^\infty f(w_1)\, g(w_2 \mid w_1)\, dw_1 = -\int_0^\infty f'(w_1 + w_2)\, dw_1 = f(w_2); \]

that is, $w_1$ and $w_2$ must be identically distributed. Note that for fixed $w_1$ 
the mean value of $w_2$ must be 

\[ \mathcal{E}(w_2 \mid w_1) = [1 - F(w_1)]/f(w_1). \]


If $w_1$ and $w_2$ are independent as well as being identically distributed, 
then we see, upon returning to (7) and differentiating with respect to t, 
that 

\[ f'(t)\,[1 - F(T - t)] + f(t)\, f(T - t) = 0, \]

or, putting t = T, 

\[ f'(T)/f(T) = -f(0) = -\theta \quad \text{(say)} \]

so that the distribution f(T) of the first and second waiting times must 
be the exponential 

\[ f(T) = \theta e^{-\theta T}. \]

From (8) we also see that the converse is true; that is, if $w_1$ and $w_2$ are 
identically, exponentially distributed then they must be independent 
if our condition of unbiasedness is to be fulfilled. 
Most of these arguments and conclusions apply in general to the 
original identity (2) holding for arbitrary K. In general, (4) becomes 

\[ P(W_{K+1} \le T) = P(W_K \le T) - \frac{T}{K}\,\frac{dP(W_K \le T)}{dT}, \]

so that (5) becomes 

\[ P(N_T = K) = \frac{T}{K}\,\frac{dP(W_K \le T)}{dT}. \]

The relation (6) becomes, in general, 

\[ \frac{dP(W_{K+1} \le T)}{dT} = \frac{(-1)^K\, T^K f^{(K)}(T)}{K!}, \qquad (9) \]

where $f^{(K)}(T)$ is the K'th derivative of f(T). Combining these last 
two equations we find that 

\[ P(N_T = K) = (-1)^{K-1}\,\frac{T^K f^{(K-1)}(T)}{K!} \qquad (10) \]

for K > 0, and 

\[ P(N_T = 0) = 1 - F(T). \]


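From (10) it is seen that the generating function $Q_T(s)$ of the distribution 
of $N_T$ takes the form 

\[ Q_T(s) = \sum_{K=0}^{\infty} s^K P(N_T = K) = 1 - F(T) + sT f(T) - \frac{(sT)^2}{2!}\, f'(T) + \cdots , \]

which is the Taylor series expansion of 

\[ Q_T(s) = 1 - F(T - Ts) \qquad (11) \]

about the point T. The factorial moments of $N_T$ are therefore 

\[ \mathcal{E}[N_T(N_T - 1)\cdots(N_T - j + 1)] = (-1)^{j-1}\, T^j f^{(j-1)}(0). \]

In particular, the mean value of $N_T$ is 

\[ \mu_T = T f(0), \]

and the variance is 

\[ \sigma_T^2 = \mu_T(1 - \mu_T) - T^2 f'(0). \]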


The earlier argument for K = 1 which led to (8) and the conclusion 
that 

\[ dP(w_2 \le x) = dP(w_1 \le x), \]

now gives for K = 2 

\[ dP(w_3 \le x) + dP(w_2 + w_3 \le x) = dP(w_1 \le x) + dP(w_1 + w_2 \le x), \]

and for K = 3 

\[ dP(w_4 \le x) + dP(w_3 + w_4 \le x) + dP(w_2 + w_3 + w_4 \le x) = dP(w_1 \le x) + dP(w_1 + w_2 \le x) + dP(w_1 + w_2 + w_3 \le x), \]

or, in general, 

\[ \sum_{k=1}^{K} dP(W_{K+1} - W_k \le x) = \sum_{k=1}^{K} dP(W_k \le x). \qquad (12) \]

The implication for K + 1 waiting times is thus not quite as specific 
as it was for 2 waiting times, namely, that the first two must be identically 
distributed. However, multiplying the system (12) by x and 
integrating does show that all waiting times must have the same expected 
value. 

The conclusions concerning independence and the exponential distribution 
do extend to arbitrary K. If waiting times are mutually 
independent, then $f(w_1) = f(w_2)$ is exponential, and from (9) we see 
that the distribution of the sum $W_K$ is simply the K-fold convolution 
of f, which implies that all waiting times are distributed according to f. 
As is well known, independent and identically exponentially distributed 
waiting times generate a Poisson distribution for number of occurrences, 
and in this case (10) gives us 

\[ P(N_T = K) = e^{-\theta T}\,\frac{(\theta T)^K}{K!}, \]

and from (11) we now obtain the Poisson generating function 

\[ Q_T(s) = e^{-\theta T(1 - s)}. \]

When fishing is a Poisson process, the conditional distribution of $n_t$, 
the number of fish on hand at the time of interview, is binomial with 
parameters $N_T$ and p = t/T, for substituting the exponential waiting 
time distribution into our formulas we obtain 

\[ P(n_t = k \mid t, T, N_T = K) = \frac{K!}{k!\,(K-k)!}\,\frac{t^k (T - t)^{K-k}}{T^K}. \]

Thus, for all t, 0 < t < T, the catch rate at t is unbiased, 

\[ \mathcal{E}(n_t/t \mid t, T, N_T) = N_T/T, \]


and here the conditional variance of the catch rate estimate is 

\[ \mathrm{Var}(n_t/t \mid t, T, N_T) = \frac{N_T}{T^2}\cdot\frac{T - t}{t}. \]

The above variance is conditional upon t as well as T and $N_T$; 
since we know the conditional distribution of t, the time of the last 
interview, we can in this case compute the value of the second component 
of variance of the estimator of $\sum N$, that is, we can now 
compute 

\[ \mathcal{E}\left[\frac{1}{c}\sum_{(m)} r\left(\frac{n}{t} - \frac{N}{T}\right)\right]^2 \]

for fixed T's and $N_T$'s. Assuming that the different fishermen are 
undergoing independent Poisson processes, we see that 


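\[ \mathcal{E}\left[\frac{1}{c}\sum_{(m)} r\left(\frac{n}{t} - \frac{N}{T}\right)\right]^2 = \frac{1}{c^2}\sum_{(M)} \frac{N_T}{T}\,\mathcal{E}\left[r^2\left(\frac{1}{t} - \frac{1}{T}\right)\right], \]

where 

\[ P(r = [cT]) = 1 - cT + [cT] = 1 - P(r = [cT] + 1), \]

and where the conditional distribution of t given r is uniform. For 
r = [cT], t must fall in the interval (T − 1/c, [cT]/c) of length 
(1/c)P(r = [cT]) and for r = [cT] + 1, t must fall in the interval 
([cT]/c, T) of length (1/c)P(r = [cT] + 1). From this joint distribution 
of r and t we find the second component of variance to be 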


where 


cT —1 
+ (1 (GH — 1) - 


C 


As c gets large then the probability of interview approaches 1 for every 
fisherman, and the time t converges to T, so this component (as well as 
the first component) approaches 0. 
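
The binomial form of the conditional distribution can be checked numerically. The sketch below assumes a Poisson fishing process with a hypothetical rate `theta`, uses the independence of the catches in (0, t] and (t, T], and compares the result with the binomial distribution with parameters $N_T$ and t/T; the function names are illustrative.

```python
from math import comb, exp, factorial


def poisson_pmf(k, mean):
    """Poisson probability of k events when the expected count is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def conditional_catch_pmf(k, K, t, T, theta):
    """P(n_t = k | N_T = K) for a Poisson fishing process with rate theta.

    Uses the independence of the catches in (0, t] and (t, T].
    """
    joint = poisson_pmf(k, theta * t) * poisson_pmf(K - k, theta * (T - t))
    return joint / poisson_pmf(K, theta * T)


def binomial_pmf(k, K, p):
    """Binomial probability of k successes in K trials."""
    return comb(K, k) * p ** k * (1 - p) ** (K - k)


def catch_rate_moments(K, t, T, theta):
    """Conditional mean and variance of the catch rate n_t / t given N_T = K."""
    probs = [conditional_catch_pmf(k, K, t, T, theta) for k in range(K + 1)]
    mean = sum(k * p for k, p in enumerate(probs)) / t
    second = sum(k * k * p for k, p in enumerate(probs)) / t ** 2
    return mean, second - mean ** 2
```

The rate `theta` cancels in the conditional distribution, so the mean of n_t/t equals $N_T/T$ and its variance equals the binomial variance $N_T (t/T)(1 - t/T)$ divided by $t^2$, whatever rate is assumed.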

Examples of other stochastic processes which satisfy our condition 
(2) of unbiasedness can now be easily constructed by the simple device 
of mixing Poisson processes. Since unbiasedness holds conditionally 
for a Poisson process with parameter $\theta T$ then it must also hold unconditionally 
when $\theta$ is assigned some probability measure $P(\theta)$. In 
effect, then, the fisherman is assigned a value of $\theta$ for his trip, where 
the $\theta$ is selected according to the distribution $P(\theta)$. The density function 
of the waiting time to first catch is then 

\[ f(w_1) = \int_0^\infty \theta e^{-\theta w_1}\, dP(\theta). \]

All waiting times will have this same marginal distribution, and 
their joint distribution will be 

\[ f(w_1, \ldots, w_K) = \int_0^\infty \theta^K e^{-\theta(w_1 + \cdots + w_K)}\, dP(\theta). \]

The covariance between any two will be the same as the covariance of 
$w_1$ and $w_2$, which can be written in general as 

\[ \mathrm{cov}(w_1, w_2) = \int_0^\infty x\,(1 - F(x))\, dx - \left\{\int_0^\infty x f(x)\, dx\right\}^2 \]

and which in the present case becomes 

\[ \mathrm{cov}(w_1, w_2) = \int \theta^{-2}\, dP(\theta) - \left[\int \theta^{-1}\, dP(\theta)\right]^2 = \mathrm{var}(1/\theta). \]

The distribution of $N_T$ is simply 

\[ P(N_T = K) = \int P(N_T = K \mid \theta)\, dP(\theta) = \frac{T^K}{K!}\int_0^\infty \theta^K e^{-\theta T}\, dP(\theta), \]

which, of course, could also be obtained from (10) using this function f. 
Since $P(n_t \mid t, T, N_T, \theta)$ is binomially distributed independently of $\theta$, 
then this is also the unconditional distribution, 

\[ P(n_t \mid t, T, N_T) = \int P(n_t \mid t, T, N_T, \theta)\, dP(\theta) = \binom{N_T}{n_t}\left(\frac{t}{T}\right)^{n_t}\left(1 - \frac{t}{T}\right)^{N_T - n_t}, \]

regardless of the mixing distribution $P(\theta)$. 
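To take a specific example of this mixing procedure, let $P(\theta)$ be the 
gamma distribution 

\[ dP(\theta) = \frac{\beta^{\alpha}\, \theta^{\alpha - 1} e^{-\beta\theta}}{\Gamma(\alpha)}\, d\theta, \]

for which 

\[ f(w) = \frac{\alpha}{\beta}\left(\frac{w}{\beta} + 1\right)^{-(\alpha + 1)} \]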


and 


B 




434 BIOMETRICS, SEPTEMBER 1961 


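The distribution of $N_T$, a gamma mixture of Poisson distributions, is 
from (10) the negative binomial 

\[ P(N_T = K) = \frac{\Gamma(\alpha + K)}{K!\,\Gamma(\alpha)}\left(\frac{\beta}{\beta + T}\right)^{\alpha}\left(\frac{T}{\beta + T}\right)^{K}. \]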


A negative binomial process derived in this manner is of no particular 
interest as such, because the basic process which the fisherman follows 
on any given trip is still Poisson. We may now change our viewpoint, 
however, and regard this method of derivation as merely a device for 
showing that fishing can be a negative binomial process and still satisfy 
our condition that catch rate is unbiased. We stand on weaker ground, 
of course, if we know that a fishing trip is a negative binomial process, 
for we cannot then be certain of unbiasedness as we could in the Poisson 
case. The same would be true of any other distribution we might 
generate by this device of mixing Poisson processes. 
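
The moment formulas for $N_T$ given earlier, mean T f(0) and variance $\mu_T(1 - \mu_T) - T^2 f'(0)$, can be checked against this negative binomial case. The sketch below assumes the gamma mixing distribution is parameterized by shape `alpha` and rate `beta`, so that the first waiting time has density $\alpha\beta^{\alpha}/(\beta + w)^{\alpha + 1}$; the function names and the truncation at 400 terms are illustrative choices.

```python
from math import exp, lgamma, log


def negative_binomial_pmf(K, T, alpha, beta):
    """P(N_T = K) for a gamma(alpha, rate beta) mixture of Poisson(theta*T)."""
    log_p = (lgamma(alpha + K) - lgamma(K + 1) - lgamma(alpha)
             + alpha * log(beta / (beta + T)) + K * log(T / (beta + T)))
    return exp(log_p)


def first_wait_density_at_zero(alpha, beta):
    """f(0) and f'(0) for f(w) = alpha * beta**alpha / (beta + w)**(alpha + 1)."""
    f0 = alpha / beta
    fprime0 = -alpha * (alpha + 1) / beta ** 2
    return f0, fprime0


def moments_from_pmf(T, alpha, beta, terms=400):
    """Total probability, mean and variance of N_T from the truncated pmf."""
    probs = [negative_binomial_pmf(K, T, alpha, beta) for K in range(terms)]
    mean = sum(K * p for K, p in enumerate(probs))
    var = sum(K * K * p for K, p in enumerate(probs)) - mean ** 2
    return sum(probs), mean, var


if __name__ == "__main__":
    T, alpha, beta = 1.0, 3.0, 2.0
    total, mean, var = moments_from_pmf(T, alpha, beta)
    f0, fprime0 = first_wait_density_at_zero(alpha, beta)
    mu = T * f0
    print(total, mean, mu, var, mu * (1 - mu) - T ** 2 * fprime0)
```

The moments computed directly from the probabilities agree with those obtained from f(0) and f'(0), which illustrates that the mixed process is described by the same relations as the simple Poisson process.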


AN EXAMPLE OF BIAS 


To illustrate the magnitude of the bias which might arise if the 
fishing process does not satisfy the conditions of the preceding section 
we may consider a simple example where successive waiting times are 
independent and identically but not exponentially distributed. In 
particular, if we let waiting time w be proportional to a chi-square 
variable with an even number of degrees of freedom, say $w = \chi^2_{2\nu}/2\beta$, 
then the computation problem remains relatively simple. For $\nu = 1$, 
of course, we again obtain the exponential distribution of waiting time. 
For $\nu > 1$ the waiting time distribution will no longer be a decreasing 
function with its maximum at the origin; rather, it will increase to 
a maximum at the point $w = (\nu - 1)/\beta > 0$ and then decrease. We 
are thus assuming that the wait between two successive catches is 
more likely to last 10-20 minutes, say, than 0-10 minutes, and that 
perhaps a wait of 20-30 minutes is still more likely than a 10-20 minute 
wait, though eventually the probability reaches a maximum and then 
steadily decreases. This might be the situation, for example, in fishing 
for some of the more voracious species of game fish which tend to overdisperse 
themselves, exhibiting a negative or repulsive contagion in 
their spatial distribution. Catching one of these fish almost automatically 
precludes the immediate capture of another. 

For illustrative purposes, then, let $\nu = 2$, so that the distribution 
of each waiting time $w_i$ is 

\[ dP(w_i \le w) = \beta^2 w e^{-\beta w}\, dw. \]


Then if the interviewer's travel rate is ¢ < 1 it can be easily shown 




A ROVING CREEL CENSUS 


from the preceding sections that for a fixed T 

\[ \mathcal{E}\left(\frac{rn}{ct} \,\Big|\, N_T = 1\right) = \frac{9 + 5\beta T}{18 + 6\beta T} \]


and 


o( 2) 400 + 127 + 656 + 2767 
300 + 6087 


Since $0 < \beta < \infty$, we see that 

\[ \frac{1}{2} < \mathcal{E}\left(\frac{rn}{ct} \,\Big|\, N_T = 1\right) < \frac{5}{6}, \]
and 


13. «9 
-2) 20° 


447 


Thus, if a fisherman catches exactly 1 fish on his trip then his expected 
contribution to the interviewer's estimate of the total catch is some- 
where between 1/2 and 5/6 rather than the desired 1. A fisherman 
who catches 2 fish may be expected to contribute somewhere between 
4/3 and to the estimate, depending upon how long he fishes. If 
he fishes all day (T = 1) to catch the 2 fish then his expected contribution 
is between 1.373 and 1.533 fish rather than 2; if he fishes only a 
very short time, say T = .01, then his expected contribution to the 
estimate is somewhere between 1.333 and 108.783, and is more likely 
to be toward the large end since that fisherman’s $\beta$-parameter is likely 
to be large. 

Clearly, if a collection of fishermen are present with varying T's and 
$\beta$'s, the interviewer's estimate in this case could not be expected to bear 
any particular resemblance to the actual total catch. 
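
The bounds of 1/2 and 5/6 quoted above for a fisherman who catches exactly one fish can be recovered with exact arithmetic. The sketch below is an illustration under stated assumptions, not the author's derivation: it uses the standard fact that a chi-square variable with 4 degrees of freedom divided by $2\beta$ is the sum of two independent exponential phases with rate $\beta$, so that every second phase is a catch, and it assumes cT < 1, so that the interview time is uniform over the trip when an interview occurs. The function names are hypothetical.

```python
from fractions import Fraction


def expected_rate_ratio(phases):
    """Integral over p = t/T in (0, 1) of E(n_t)/p when the trip contains
    exactly `phases` exponential phases and every second phase is a catch.

    Given the phase count, each phase falls before time t with probability p,
    and the integral reduces to sum(floor(i/2) / i for i = 1..phases).
    """
    return sum(Fraction(i // 2, i) for i in range(1, phases + 1))


def expected_contribution_one_fish(beta_T):
    """Expected value of r*n/(c*t) given N_T = 1, for beta*T > 0 and c*T < 1.

    One fish on the trip means 2 or 3 phases occurred in (0, T]; their
    Poisson weights are proportional to x**2/2! and x**3/3! with x = beta*T.
    """
    x = Fraction(beta_T)
    w2, w3 = x ** 2 / 2, x ** 3 / 6
    return (w2 * expected_rate_ratio(2) + w3 * expected_rate_ratio(3)) / (w2 + w3)


if __name__ == "__main__":
    for x in (Fraction(1, 10), Fraction(1), Fraction(7)):
        print(x, expected_contribution_one_fish(x))
```

The result simplifies to $(9 + 5\beta T)/(18 + 6\beta T)$, which rises from 1/2 toward 5/6 as $\beta T$ increases and never reaches the desired value of 1.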


DISCUSSION 


The major weakness of the roving creel census of incompleted 
fishing trips as a technique for estimating total catch is that the bias of 
estimation depends on the basic nature of the stochastic fishing process, 
and this, in general, is unknown. We can only speculate as to the 
nature of the fishing process and whether or not it satisfies the conditions 
of unbiasedness, but in some circumstances the answer is obvious. 
If the fish occur in schools and several may be captured whenever a 
school is encountered, then waiting time to first fish is waiting time to 
first school, but waiting time from first to second fish may be waiting 
time between catches within a school, and these two are not identically 
distributed chance variables. Or the habit of the fisherman may be 
to visit the more productive locations during the first part of his trip 
and then spend some time exploring for more. A variety of arguments 
could be put forward for unequal expected waiting times in violation of 
our conditions for unbiasedness, and the resulting bias could be of 
considerable magnitude. 

The only sure way of avoiding this problem in a creel census is to 
use some other technique in place of the roving interviewer, to make 
the creel census distribution-free in the sense of an ordinary sample 
survey method. One effective, distribution-free technique is to station 
interviewers at the access points of the fishery to obtain information 
on completed trips. When no well defined access points exist, inter- 
viewers may be assigned randomly chosen area segments of the fishery 
in which they are to obtain completed-trip information [1]. References 
to these and other techniques may be found in an extensive bibliography 
on game kill and creel census procedures prepared by V. Schultz [2]. 

It should be noted that the implications concerning the fishing 
process which were deduced from the condition of unbiasedness of the 
catch rate at time of interview apply to the general problem of predicting 
the number of events in a stochastic process. If the number 
$n_t$ of events in a time-continuous process is observed after the process 
has been in operation for a time t, then in order for $n_t T/t$ to be an unbiased 
predictor of the number of events $N_T$ which will have occurred 
at some specified later time T, the process must satisfy the conditions 
described here. 

There are also other applications in which the role of the time 
dimension is played by some other continuous variable; for example, 
in a process where objects are randomly drawn until their combined 
weight first exceeds T pounds and then a sample of these objects is 
withdrawn until the weight of the sample first exceeds t pounds, t < T, 
then $(n_t + 1)T/t$ will be an unbiased predictor of the number $N_T + 1$ 
of objects in the T-pound sample only if the weight of individual objects 
is exponentially distributed. An interesting modification of this 
problem arises when the exact weights of the sample and/or subsample 
are known; that is, when the excess weights above T and t pounds, 
respectively, are also measured. In the analogous fishing-interviewing 
process it might be possible for the interviewer to determine that 
time t', t' < t, at which the $n_t$'th fish was caught, and use of this type of 
information would certainly modify the form and properties of the 
predictor of total catch. Such problems involving additional interview 
information warrant further investigation. 

ALLOCATION OF EXPERIMENTAL UNITS IN 
SOME ELEMENTARY BALANCED DESIGNS 


J.S. 
Research Triangle Institute, 
Durham, North Carolina, U.S.A. 


INTRODUCTION 


In recent years an increased amount of effort has been devoted to 
studies of how best to allocate experimental or sampling units for 
estimating variance components. Concepts which originated with the 
sample survey workers for estimating means and totals are reviewed 
in Cochran’s text [1953]. Crump [1954] and Gaylor [1960] have ex- 
tended these concepts to the estimation of variance components in 
nested and cross-classified designs. 

The usual procedure of allocation is to minimize a set of specified 
variances, often subject to given cost restrictions and a fixed total 
number of units. This can obscure the fact that the fixed number of 
units automatically limits the amount of attention which can be devoted 
to any one component. Also it is superficial to minimize variances 
subject to cost restrictions, if cost is not a major consideration. Achiev- 
ing a predetermined balance of variances of the estimates sometimes is 
more meaningful, but frequently as will be shown, this is not obtainable 
with a balanced allocation. 

The following development illustrates these points for balanced, 
two-level nested allocations and two-way cross-classified allocations of 
normally distributed variables. The assumption of normality usually 
is justified for applications where variance component analysis is important, 
such as in genetic and chemical studies. 


GENERAL DEVELOPMENT 


In the simple two-level, nested analysis of variance, there are $n_1$ 
primary (or first level) experimental units, $n_2$ secondary (or second level) 
experimental units for each primary unit, and $N = n_1 n_2$ units in all. 
The two components of variance which can be estimated are $\sigma^2$, 
within, estimated by the within mean square $SSW/n_1(n_2 - 1) = \hat{\sigma}^2$, 
and the among $\sigma_a^2 = k\sigma^2$ (say), estimated by the difference between the 
among and within mean squares, 

\[ [SSA/(n_1 - 1)] - [SSW/n_1(n_2 - 1)] = n_2 \hat{\sigma}_a^2 . \]

The circumstances of the populations to be sampled may dictate 
that either (i) $n_1$ is predetermined and $n_2$ is any integer or (ii) N is 
fixed and $(n_1, n_2)$ is a pair of integers such that $n_1 n_2 = N$. 

¹Part of this paper was written while the author was a graduate student at North Carolina State 
College. 

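(i) For the assumption of normality, 

\[ \mathrm{Var}[\hat{\sigma}_a^2] = \frac{2\sigma^4}{n_2^2}\cdot\frac{n_1(n_2 - 1)(1 + n_2 k)^2 + (n_1 - 1)}{n_1(n_1 - 1)(n_2 - 1)} \qquad (1) \]

\[ \mathrm{Var}[\hat{\sigma}^2] = \frac{2\sigma^4}{n_1(n_2 - 1)} \qquad (2) \]

In $\mathrm{Var}[\hat{\sigma}_a^2]$, the variance of $SSA/(n_1 - 1)$, 

\[ \frac{2\sigma^4}{n_1 - 1}\left[\frac{1}{n_2^2} + \frac{2k}{n_2} + k^2\right], \]

dominates for all but very small values of $n_2$ and k. For fixed $n_1$, the 
most important term of this variance is $k^2$, which is determined by the 
uncontrolled, physical properties of the population sampled. When k 
is not small, large values of $n_2$ can have little effect on $\mathrm{Var}[\hat{\sigma}_a^2]$ while 
they provide a more reliable estimate of $\sigma^2$ than is required. Therefore, 
let us investigate how to choose $n_2$ so that the relative effort allotted 
to the estimation of $\sigma_a^2$ and $\sigma^2$ satisfies a predetermined measure of the 
relative importance of these components. 

If 

\[ R = \frac{\mathrm{Var}[\hat{\sigma}_a^2]}{\mathrm{Var}[\hat{\sigma}^2]}, \qquad (3) \]

where R will be called the relative precision of estimation, then 

\[ R = \frac{n_1(n_2 - 1)(1 + n_2 k)^2 + (n_1 - 1)}{n_2^2\,(n_1 - 1)}. \qquad (4) \]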


The expression (4) reduces to a cubic equation in $n_2$, which can be 
solved to indicate the correct choice of $n_2$ for a balanced allocation, 
or bounds for the correct solution can be derived in the following manner: 

(a) Approximate R by 

\[ R' = \frac{(n_1 - 1)(n_2 - 1)(1 + n_2 k)^2 + (n_1 - 1)}{n_2^2\,(n_1 - 1)} = \frac{(n_2 - 1)(1 + n_2 k)^2 + 1}{n_2^2}. \qquad (5) \]

Subject to $n_2 > 0$, (5) reduces to the quadratic equation in $n_2$ 

\[ n_2^2(k^2) - n_2[R' + k(k - 2)] - (2k - 1) = 0 \qquad (6) \]

with roots 

\[ n_2(R') = \frac{[R' + k(k - 2)] \pm \sqrt{[R' + k(k - 2)]^2 - 4k^2(1 - 2k)}}{2k^2}. \qquad (7) \]



The larger root will be considered because only very infrequently does 
the smaller root supply an applicable solution. Using this root, the 
relative precision actually achieved, R, , is 


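(b) Approximate R by 

\[ R'' = \frac{n_1(n_2 - 1)(1 + n_2 k)^2 + n_1}{n_2^2\,(n_1 - 1)}. \qquad (8) \]

Subject to $n_1 > 1$, $n_2 > 0$, (8) reduces to the quadratic equation 

\[ n_2^2(n_1 k^2) - n_2[(n_1 - 1)R'' + n_1 k(k - 2)] - n_1(2k - 1) = 0 \qquad (9) \]

with roots 

\[ n_2(R'') = \frac{[(n_1 - 1)R'' + n_1 k(k - 2)] \pm \sqrt{[(n_1 - 1)R'' + n_1 k(k - 2)]^2 - 4n_1^2 k^2(1 - 2k)}}{2n_1 k^2}. \]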


Again only the larger root will be considered. The precision achieved 
for this allocation, R, , is 


(10) 


<2”. 

R'' is the more desirable of the approximations since it utilizes the 
value $n_1$, and it provides a solution only slightly smaller than the 
desired precision. For large values of $n_1$, however, it is readily seen 
from an examination of the interval about the specified ratio, 

\[ R' < R < R'', \]

that the result from either approximation provides a relative precision 
very close to R. Thus, for large predetermined $n_1$, the solution for $n_2$ 
is dependent only on the relative sizes of the variance components to 
be estimated and the relative precision desired for these estimates. 
In fact, $n_2(R')$ and $n_2(R'')$ are strictly increasing in the range R', 
$R'' > n_1 k(2 - k)/(n_1 - 1)$, which exists for the majority of the situations 
encountered. Increasing $n_2$ above these values will only increase R 
and place more emphasis on the estimation of $\sigma^2$ than is desired, particularly 
since $\sigma_a^2$ usually is the more interesting parameter. 
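
The two approximations can be compared numerically. The sketch below assumes that expression (4) is the ratio of the usual normal-theory variances for a balanced two-level nested design, and that the quadratics (6) and (9) take the forms that follow from replacing $n_1$ by $n_1 - 1$, or $n_1 - 1$ by $n_1$, in one term of its numerator; the function names and the example values are illustrative.

```python
from math import sqrt


def exact_R(n1, n2, k):
    """Relative precision R of expression (4) for a balanced nested design."""
    return (n1 * (n2 - 1) * (1 + n2 * k) ** 2 + (n1 - 1)) / (n2 ** 2 * (n1 - 1))


def larger_root(a, b, c):
    """Larger real root of a*x**2 + b*x + c = 0 with a > 0."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real solution for this R and k")
    return (-b + sqrt(disc)) / (2 * a)


def n2_from_R_prime(R, k):
    """Solve quadratic (6): k^2 n2^2 - n2 [R + k(k - 2)] - (2k - 1) = 0."""
    return larger_root(k ** 2, -(R + k * (k - 2)), -(2 * k - 1))


def n2_from_R_double_prime(R, n1, k):
    """Solve quadratic (9), which also uses the number of primary units n1."""
    b = -((n1 - 1) * R + n1 * k * (k - 2))
    return larger_root(n1 * k ** 2, b, -n1 * (2 * k - 1))


if __name__ == "__main__":
    n1, n2, k = 10, 4, 1.0
    R = exact_R(n1, n2, k)  # precision actually given by n1 = 10, n2 = 4
    print(n2_from_R_double_prime(R, n1, k), n2_from_R_prime(R, k))
```

For $n_1 = 10$, $n_2 = 4$ and k = 1, the two solutions are about 3.99 and 4.49, so they bracket the value $n_2 = 4$ that produces the specified R exactly.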

Most importantly, $n_2(R') > n_2(R'')$, so that an immediate check 
is available on whether or not a balanced allocation exists for a desired 
relative precision. If $n_2(R') < 2$, then a larger relative precision must 
be accepted if a balanced allocation is to be used. From (7) it is seen 
that a real-valued solution of $n_2(R')$ always exists if $k > \tfrac{1}{2}$. For this 
case, $n_2(R') < 2$ is certain to occur if $R < k(k + 1) + \tfrac{1}{2}$. When $k < \tfrac{1}{2}$ 
and the solution is real, i.e. $R \ge k[(2 - k) + 2\sqrt{1 - 2k}]$, no balanced 
allocation exists if simultaneously $R < k(k + 1) + \tfrac{1}{2}$, a balanced allocation 
is certain to exist if $R > k(3k + 2)$, and between these bounds the 
existence is questionable. Obviously for large values of k, balanced 
allocations exist only when there is a willingness to place much more 
emphasis on the estimation of $\sigma^2$ than $\sigma_a^2$. 

In the case of predetermined $n_1$, $n_2 = 2$ probably will be accepted 
when the balanced solution doesn’t exist since large R indicates only 
that a better estimate of $\sigma^2$ will be obtained than that which was thought 
necessary. 

(ii) The more important allocation problem occurs when N is 
predetermined, and $n_1$ and $n_2$ must be chosen, subject to $n_1 n_2 = N$, 
so that the desired relative precision R is achieved. 

In (4) the number of primary units $n_1$ can be replaced by $N/n_2$, 

\[ R = \frac{N(n_2 - 1)(1 + n_2 k)^2 + (N - n_2)}{n_2^2\,(N - n_2)}, \qquad (11) \]

and the correct allocation of secondary units satisfies the quadratic 
equation 

\[ n_2^2(R + Nk^2) - n_2 N[R + k(k - 2)] + [N(1 - 2k) - 1] = 0. \qquad (12) \]

The desired solutions are obtained from 

\[ n_2(R) = \frac{N[R + k(k - 2)] \pm \sqrt{N^2[R + k(k - 2)]^2 - 4[N(1 - 2k) - 1](R + Nk^2)}}{2(R + Nk^2)} \qquad (13) \]

\[ n_1(R) = \frac{N}{n_2(R)} = \frac{N\{N[R + k(k - 2)] \mp \sqrt{N^2[R + k(k - 2)]^2 - 4[N(1 - 2k) - 1](R + Nk^2)}\}}{2[N(1 - 2k) - 1]} \qquad (14) \]


With this solution, it is easy to check whether a balanced allocation 
exists for the k at hand and the desired R. Consider first the case where 
$R + k^2 - 2k$ is positive. If k > (N − 1)/2N, $n_2(R)$ is obtained from 
the solution with the positive radical element, $n_1(R)$ from the solution 
with the negative radical element. If k < (N − 1)/2N, two possibly 
acceptable allocations exist by the choice of the positive and negative 
radical elements. If k = (N − 1)/2N, the solution, taken directly 
from (12), is 

\[ n_2(R) = \frac{4N^2 R - (N - 1)(3N + 1)}{4NR + (N - 1)^2} \]

and $n_1(R) = N/n_2(R)$. If $R + k^2 - 2k$ is negative, the result for 
k > (N − 1)/2N holds, but there is no allocation for k < (N − 1)/2N. 

If $n_1(R)$ or $n_2(R)$ is less than two, intuitively, it seems clear that 
there will be some staggered allocations which will provide more desirable 
estimates of $\sigma_a^2$ and $\sigma^2$ than the minimum balanced allocation 
achieved by setting these values at two. 
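
The existence check for the fixed-total case can be sketched as follows. The code assumes that expression (4) has the usual normal-theory form for a balanced nested design and that substituting $n_1 = N/n_2$ into it gives the quadratic $n_2^2(R + Nk^2) - n_2 N[R + k(k - 2)] + [N(1 - 2k) - 1] = 0$; the function names and the example values are illustrative, and the roots returned are generally not integers.

```python
from math import sqrt


def relative_precision(n1, n2, k):
    """R = Var[among estimate] / Var[within estimate], as in expression (4)."""
    num = n1 * (n2 - 1) * (1 + n2 * k) ** 2 + (n1 - 1)
    return num / (n2 ** 2 * (n1 - 1))


def allocations_for_fixed_total(N, k, R):
    """Balanced allocations (n1, n2) with n1 * n2 = N giving precision R.

    Solves the quadratic in n2 obtained by putting n1 = N / n2 in (4) and
    keeps only real roots with 1 < n2 < N.
    """
    a = R + N * k ** 2
    b = -N * (R + k * (k - 2))
    c = N * (1 - 2 * k) - 1
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # complex roots: no balanced allocation for this R
    roots = {(-b + sqrt(disc)) / (2 * a), (-b - sqrt(disc)) / (2 * a)}
    return sorted((N / n2, n2) for n2 in roots if 1 < n2 < N)


if __name__ == "__main__":
    N, k = 40, 1.0
    R = relative_precision(10, 4, k)  # precision given by n1 = 10, n2 = 4
    print(allocations_for_fixed_total(N, k, R))
```

With N = 40 and k = 1, which exceeds (N − 1)/2N, only the root with the positive radical is admissible, and it returns the allocation $n_1 = 10$, $n_2 = 4$ from which R was computed.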

The same development can be extended to balanced, rows and 
columns, cross-classified designs. Let r be the number of rows, c be 
the number of columns, $\sigma_r^2$ be the variance attributable to the row 
effects, $\sigma_c^2$ be the variance attributable to the column effects, and $\sigma^2$ 
be the residual variance. Set $k_1 = \sigma_r^2/\sigma^2$ and $k_2 = \sigma_c^2/\sigma^2$, then 


Var E — 1)(1 + chy)? + 
The proper allocation is obtained from 
—2)R —ko(k2—2)] 


2)R— — 2k, —(1—2k,)] 
—2k.)] 
(16) 


(15) 


r(R) = 


N[ki(ki —2)R — ko(k2 —2)] 
—2k,)R—Nk?] 


2[(1—2k,)R— NE] 


c(R) = 


(17) 
The allocation will depend on the signs of

\[
[(1 - 2k_1)R - Nk_2^2] \quad\text{and}\quad [Nk_1^2R - (1 - 2k_2)], \tag{18}
\]

because the signs of the radical must be different in the two solutions.
Three sets of conditions exist which affect the choice of $r$ and $c$. These
conditions are functions only of the physical limitations of the
estimation problem ($N$, $k_1$, and $k_2$) and the desired relative precision
of estimation ($R$). Let $k_1(k_1 - 2)R - k_2(k_2 - 2)$ be positive:

(a) If both elements of (18) are positive, then there are two solutions
from (16) and (17) which can lead to acceptable allocations.

(b) If the elements of (18) have opposing signs, then there exists one
solution from (16) and (17) for an acceptable allocation.

(c) If both elements of (18) are negative, there is no acceptable
allocation.

When $k_1(k_1 - 2)R - k_2(k_2 - 2)$ is negative, the results of (a) and (c)
are interchanged. In each case, there is the hazard that the solutions
for a desired $R$ will be complex values.
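The sign conditions above amount to asking which roots of a quadratic in $r$ are real and usable. The sketch below is an illustration under stated assumptions: it takes the quadratic $Ar^2 - Br + C = 0$ with $A$ and $C/N$ equal to the two elements of (18), and it assumes $R$ satisfies $R\,r^2[(c-1)(1+ck_1)^2+1] = c^2[(r-1)(1+rk_2)^2+1]$ with $c = N/r$. The function names are hypothetical.

```python
import math

def cross_classified_allocation(N, k1, k2, R):
    """Return candidate (r, c) pairs with r * c = N for a desired R."""
    A = (1 - 2 * k1) * R - N * k2**2            # first element of (18)
    B = N * (k1 * (k1 - 2) * R - k2 * (k2 - 2))
    C = N * (N * k1**2 * R - (1 - 2 * k2))      # N times second element of (18)
    if A == 0:
        roots = [C / B] if B != 0 else []
    else:
        disc = B * B - 4 * A * C
        if disc < 0:
            return []                            # complex solutions
        s = math.sqrt(disc)
        roots = [(B + s) / (2 * A), (B - s) / (2 * A)]
    return [(r, N / r) for r in roots if 1 <= r <= N]

def implied_R(r, c, k1, k2):
    """Relative precision implied by an allocation (assumed form)."""
    num = c**2 * ((r - 1) * (1 + r * k2) ** 2 + 1)
    den = r**2 * ((c - 1) * (1 + c * k1) ** 2 + 1)
    return num / den

print(cross_classified_allocation(100, 1, 1, 1))   # symmetric case
```

Substituting a returned pair into `implied_R` is a quick way to confirm that a root reproduces the requested $R$ before rounding to integers.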

In the examination for possible allocations of the nested, fixed- 
total design and the cross-classified design, it is easier to check for the 
existence of an allocation by substitutions into (13), (14), (16), and (17) 
of the parameters of a specific problem than to give general bounds 
on $R$. A rough indication of the existence of an allocation can be
obtained by replacing $k$, $k_1$, and $k_2$ by estimated values. If data from
a pilot experiment of the nested type are available, the estimate


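\[
\hat k = \frac{N' - n_1' - 2}{N' - n_1'}\,\frac{\hat\sigma_a'^2}{\hat\sigma'^2} - \frac{2}{n_2'(N' - n_1')}, \tag{19}
\]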


where the primes indicate values from the pilot data, is unbiased. The
estimates of $k_1$ and $k_2$ have the same form as (19) where
$\hat\sigma_a'^2$ and $\hat\sigma'^2$ are replaced by $\hat\sigma_r'^2$ or
$\hat\sigma_c'^2$ and $\hat\sigma'^2$, and $(r' - 1)$ or $(c' - 1)$ replaces
$(n_1' - 1)$, $(r' - 1)(c' - 1)$ replaces $N' - n_1'$, and $1/c'$ or $1/r'$
replaces $1/n_2'$. It is the experience of the author that this check
indicates a balanced allocation only when the ratio of the estimate of $k$
to the desired value of $R$ is small.
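As an illustration of the pilot estimate, the sketch below computes $\hat k$ from the two mean squares of a balanced one-way analysis of variance. It assumes a bias-corrected ratio estimator based on the mean of the F distribution; this form is an assumption, chosen because it reproduces the value 1.66 quoted in the example later in the paper. The function name and arguments are hypothetical.

```python
def estimate_k(ms_among, ms_within, n1, n2):
    """Estimate k = sigma_a^2 / sigma^2 from pilot mean squares.

    n1 = number of groups, n2 = observations per group in the pilot data.
    """
    df_within = n1 * n2 - n1                    # N' - n1'
    var_among = (ms_among - ms_within) / n2     # estimate of sigma_a^2
    ratio = var_among / ms_within
    # correction factor (df - 2)/df removes the bias of the ratio
    return (df_within - 2) / df_within * ratio - 2 / (n2 * df_within)

print(round(estimate_k(19.03, 1.02, 4, 10), 2))
```

The correction matters most when the within-group degrees of freedom are small.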

The existence check also provides an estimate of the desired allocation
if one is indicated. In these estimated allocations, it is unlikely
that any of the solutions using $\hat k$ or the true $k$ will be an integer.
No investigation has been made as to what should be done in this case,
but it seems reasonable that if $n_2(R')$ and $n_2(R'')$ bound an integer,
that integer should be used. If they do not, take the next largest
integer; at worst, for known $k$, this will give an estimate of $\sigma^2$
which is better than desired. For $n_2(R)$, $r(R)$, and $c(R)$, take that
one of the integers bounding the solution which, when substituted into
(6), (16) or (17), most closely approaches the specified $R$. Like the
existence check, the estimated allocations should be regarded only as
crude approximations to the desired allocations.


AN EXTENSION FOR THE JOINT ESTIMATION OF 
p SETS OF VARIANCES IN NESTED ALLOCATIONS 


Frequently, as in genetic work, more than one variate is measured
on each individual experimental unit. For $p$ such variates it is desired
to obtain $p$ sets of estimates of the variances $\sigma_{ai}^2$ and
$\sigma_i^2$, $i = 1, \ldots, p$. Since it is often economically unwise to
allocate by the variate rather than the whole experimental unit, the
allocation must be in terms of a single measure of relative precision
for all $p$ sets of estimates.

It may be that there is one particularly important variate $x_{i'}$ which
must provide estimates with a relative precision $R_{i'}$. Then the
allocation would be decided by using $R_{i'}$ and $k_{i'}$ in the formulae of
the preceding section. If, however, one is satisfied with an average
relative precision over all variates measured, let


\[
R = \sum_{i=1}^{p} R_i/p. \tag{20}
\]

Then,

\[
R = \frac{n_1(n_2 - 1)\Big\{1 + \dfrac{2n_2}{p}\sum_{i=1}^{p} k_i + \dfrac{n_2^2}{p}\sum_{i=1}^{p} k_i^2\Big\} + (n_1 - 1)}{n_2^2(n_1 - 1)}. \tag{21}
\]

Define

\[
(k) = \sum_{i=1}^{p} k_i/p, \qquad (k^2) = \sum_{i=1}^{p} k_i^2/p.
\]

The two approximations for case (i) with fixed $n_1$ are

\[
n_2(R') = \frac{[R' + (k^2) - 2(k)] \pm \sqrt{[R' + (k^2) - 2(k)]^2 - 4(k^2)[1 - 2(k)]}}{2(k^2)}
\]

\[
n_2(R'') = \frac{[(n_1 - 1)R'' + n_1(k^2) - 2n_1(k)] \pm \sqrt{[(n_1 - 1)R'' + n_1(k^2) - 2n_1(k)]^2 - 4n_1^2(k^2)[1 - 2(k)]}}{2n_1(k^2)}
\]

and for case (ii),

\[
n_2(R) = \frac{N[R + (k^2) - 2(k)] \pm \sqrt{N^2[R + (k^2) - 2(k)]^2 - 4[N(1 - 2(k)) - 1][R + N(k^2)]}}{2[R + N(k^2)]}. \tag{23}
\]

In general the features of the single variate case apply here, except
it should be noted that $(k^2) \ge (k)^2$.

There are two drawbacks to using $R$ as a measure of relative precision.
If any $k_i$ is exceedingly large, it precludes the existence of a
balanced allocation, even though singly each of the remaining $p - 1$
variance sets could be estimated using balanced allocations. The
average $R$ does not incorporate any information about the relative
importance of the different sets of estimates. A variate of small interest
may provide estimates with a low relative precision, while a very
important variate may provide estimates with a very large relative
precision.

This generalization to p sets of variances obviously does not hold 
for the cross-classified designs. 


AN EXAMPLE


In Table 1, an analysis of variance of plant data from a genetic
experiment with inbred lines is presented. The among lines variance
$\sigma_a^2$ is the genetic variance among the plants being studied, and the
within line variance, $\sigma^2$, is variation caused by the environment.
These two variance components determine the coefficient of heritability
of the character being studied; thus both must be estimated.


TABLE 1
ANALYSIS OF VARIANCE OF FOUR INBRED LINES

Source of variation | D.f. | M.S.
Among lines         |   3  | 19.03
Within lines        |  36  |  1.02

These data can be used to plan two types of experiment for estimating
$\sigma^2$ and $\sigma_a^2$. In the first experiment, it will be assumed that
200 inbred lines are available for testing and ample land is available
for setting out any reasonably sized experiment. For the second
experiment, it will be assumed that there is only enough land for a 500
plot experiment, but that any number of inbred lines are available for
planting.

To check for the existence of balanced allocations, $k$ is estimated using
formula (19) with $N' = 40$, $n_1' = 4$, $n_2' = 10$,
$\hat\sigma_a'^2 = (19.03 - 1.02)/10 = 1.80$, and $\hat\sigma'^2 = 1.02$. The
estimate, $\hat k$, is 1.66. Since $\hat k$ is greater than .5, it should be
suspected that $R$ will have to be considerably larger than one to obtain
a balanced allocation. Let any $R$ value between one and five be
acceptable.

(i) Experiment 1: A balanced allocation is indicated if
$R = \hat k(\hat k + 1) + .5 = 4.92$. The upper acceptable limit of $R$ does
provide an estimate of a balanced allocation. Substituting $R = 5$ and
$k = 1.66$ into (7) or (10) gives an estimated allocation of $n_1 = 200$,
$n_2 = 2$; an experiment utilizing 400 plots.

(ii) Experiment 2: For this experiment $R + \hat k^2 - 2\hat k = 4.44$
when $R = 5$, and $\hat k > (N - 1)/2N$; the larger root of (13) is used.
To the nearest integer values, the estimated allocation is $n_1 = 250$,
$n_2 = 2$.
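The two estimated allocations can be checked against the requested range of $R$. The sketch below assumes that the relative precision of a balanced nested allocation is $R = \{(1 + n_2k)^2 n_1(n_2 - 1)/(n_1 - 1) + 1\}/n_2^2$, a form that reduces to $k(k + 1) + .5$ when $n_2 = 2$ and $n_1$ is large. The function name is hypothetical.

```python
def relative_precision(n1, n2, k):
    """Assumed relative precision R of a balanced nested allocation."""
    return ((1 + n2 * k) ** 2 * n1 * (n2 - 1) / (n1 - 1) + 1) / n2**2

k_hat = 1.66
for n1 in (200, 250):
    print(n1, 2, round(relative_precision(n1, 2, k_hat), 3))
# limiting value for n2 = 2 and many groups
print(round(k_hat * (k_hat + 1) + 0.5, 2))
```

Both allocations give a value just above the limit $k(k + 1) + .5$ and inside the acceptable range of one to five.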

In either experiment, to achieve a more even distribution of the
variances of the estimates, an unbalanced allocation would have to
be used.


AUGMENTED DESIGNS WITH ONE-WAY ELIMINATION
OF HETEROGENEITY¹


WALTER T. FEDERER 
Cornell University, Ithaca, New York, U.S.A. 


INTRODUCTION 


One of the principal problems in plant breeding and in biochemical 
research of new pesticides, herbicides, soil fumigants, drugs, etc., is the 
evaluation of the new strain or chemical. Efficient experimental designs 
and efficient screening procedures are necessary in order to make the 
most efficient use of available resources. In some instances sufficient 
material of a new strain or a new chemical is available for only one or 
two observations (plots). Hence, the experimenter should use an 
experimental design and a screening procedure suitable for these con- 
ditions. In other cases, the experimenter may wish to limit his observa- 
tions to a single observation on the new material. In still other cases 
(e.g., in physics), a single observation on new material may be desirable 
because of relatively low variability in the experimental material. 
Furthermore, it may be desired to combine screening experiments on 
new material and preliminary testing experiments on promising material. 
The experimental design should be selected to meet the requirements 
of such experiments rather than selecting the material and experiments 
to meet the requirements of the experimental design. The experi- 
mental designs described in the present paper were developed to satisfy 
requirements such as those described above. 

The class of experimental designs known as augmented designs was 
introduced by the author in 1955 to fill a need arising in screening new 
strains of sugar cane and soil fumigants used in growing pineapples²
(Federer [1956a, 1956b, 1956c, 1958]). An augmented experimental 
design is any standard design augmented with additional treatments 
in the complete block, the incomplete block, the row, the column, etc. 


¹Paper No. BU-39 of the Biometrics Unit and Paper No. 341 of the Department of Plant Breeding.

²G. N. Rao, Sugar Cane Research Station, Anakapalle, Andra Pradesh, India, has used some of
these designs for field lay-outs of sugar cane (personal communication, 7/19/57). J. G. Darroch,
Experiment Station, Hawaiian Sugar Planters’ Association, Honolulu, Hawaii, expected that eight
experiments designed as augmented randomized complete block designs would be harvested in 1957
and several more had been installed (personal communication, 8/23/57).

The construction and randomization procedures will become apparent
after consideration of a few specific examples. Analyses for some of
these designs have appeared in the literature (Federer [1956a, 1956c]).
The purpose of this paper is to present the general approach for all
augmented designs with one-way elimination of heterogeneity and to
present examples of two specific augmented designs, the augmented
randomized complete block design and the augmented balanced lattice
design. The general approach for designs with two- and higher-way
elimination of heterogeneity will be presented in a forthcoming paper.

Examples of specific designs with additional treatments, unequal
replication on treatments, or unequal block sizes have appeared in
statistical literature (e.g., Basson, [1959]; Corsten, [1959]; Das, [1958];
Graybill and Pruitt, [1958]; Justensen and Keuls, [1958]; Kishen, [1941];
McIntyre, [1958]; Pearce, [1948]; Yates, [1936b]; Youden and Connor,
[1953]). However, a general approach covering the class of augmented
designs as well as the others has not been presented. This is done
in the present paper.

The randomized complete block design and the one-restrictional
lattice designs are well known examples of experimental designs with
one-way elimination of heterogeneity. General methods of analyses
have been developed for incomplete block designs and for non-orthogonal
situations (e.g., Bose and Nair, [1939]; Federer, [1957]; Kempthorne,
[1952]; Nair, [1941]; Rao, [1947]; Yates, [1934], [1936a], [1936b], [1938]).
Analyses for augmented designs with one-way elimination of
heterogeneity are developed along similar lines.


CONSTRUCTION AND RANDOMIZATION 


The construction of augmented designs with one-way elimination
of heterogeneity is illustrated with examples. Consider first the
augmented randomized complete block design. Here there are $N_j$
plots (experimental units) in each of the $j = 1, \ldots, r$ blocks, with
the $N_j$ not necessarily equal; there are two kinds of treatments,
treatments repeated $r$ times and occurring once in every block and
treatments repeated less than $r$ times (the treatments could appear more
than $r$ times in the experiment and the analyses still hold) and hence
occurring in only a portion of the blocks. For a large number of
situations a number $v_s$ of treatments will occur once in each of the $r$
blocks and a number $v_n$ of treatments will occur once in one of the $r$
blocks; for convenience call the former group standards or standard
treatments and the latter new treatments. The schematic treatment layout
for $r = 5$ blocks, $v_s = 4$ standards (A, B, C, D), and $v_n = 13$ new
treatments (e, f, g, h, i, j, k, l, m, n, o, p, q) is:

[Table: schematic treatment layout by block number]

where the $N_j$ (= 6 or 7) were made as nearly equal as possible although
this is not necessarily the best grouping for all experimental situations.

The randomization procedure for the augmented randomized com- 
plete block design is: 


(i) Randomly allot the $v_s$ standards to $v_s$ of the $N_j = n_{.j.}$ plots
in each block.

(ii) Randomly allot the $v_n$ new treatments to the remaining plots,
which is equivalent to randomly assigning the lower case letters
to the new treatments and assigning the letters in order to the
remaining plots.

(iii) If a new treatment appears more than once, assign the different 
entries of the treatment to a complete block at random with 
the proviso that no treatment occurs more than once in a com- 
plete block until that treatment occurs once in each of the 
complete blocks. 
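Steps (i) and (ii) can be sketched in code for the common case in which every new treatment appears exactly once; step (iii) is not covered. The function name, the seed argument, and the rule of making block sizes as nearly equal as possible are illustrative assumptions rather than part of the procedure as published.

```python
import random

def randomize_augmented_rcb(standards, new_treatments, n_blocks, seed=None):
    """Return one random layout: a list of blocks, each a list of plots."""
    rng = random.Random(seed)
    new = list(new_treatments)
    rng.shuffle(new)                      # random order of new treatments
    base, extra = divmod(len(new), n_blocks)
    layout, start = [], 0
    for j in range(n_blocks):
        size = base + (1 if j < extra else 0)
        # every block holds all standards plus its share of new treatments
        plots = list(standards) + new[start:start + size]
        start += size
        rng.shuffle(plots)                # random plot positions in the block
        layout.append(plots)
    return layout

for j, block in enumerate(randomize_augmented_rcb("ABCD", "efghijklmnopq", 5, seed=1), 1):
    print(j, " ".join(block))
```

Shuffling the complete list of plots in a block places the standards at random positions and leaves the new treatments in the remaining plots, which is the effect steps (i) and (ii) describe.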


The augmented triple lattice design with $v_s = 9$ standard
treatments (capital letters) and $v_n = 15$ new treatments (lower case
letters) is used to illustrate the construction of augmented incomplete
block designs. The schematic lay-out for the treatments is:

[Table: schematic lay-out of the augmented triple lattice design]

The same procedure may be used to obtain other incomplete block
designs. For example, a fourth group of treatments, which would make
the above an augmented balanced lattice design, is (for $v_n = 18$):


Replicate 4

Incomplete block no.     1    2    3
Standards                A    B    C
                         F    D    E
                         H    I    G
New treatments           p    q    r
Block size               4    4    4


The randomization procedure for any augmented incomplete block
design for $v_s$ standards, repeated $r$ times, and $v - v_s$ new treatments in
incomplete blocks of size $n_{.jh}$, is:


(i) Randomly allot the groups to the incomplete blocks within
each replicate.

(ii) Randomly allot the standards within each group to the $n_{.jh}$
plots of block $jh$.

(iii) Randomly assign the new treatments to the remaining plots. 

(iv) New treatments appearing more than once in the experiment 
should be randomly allotted to the incomplete blocks with the 
provisos that the treatment should not appear twice in any 
one replicate until it has appeared once in all replicates and the 
treatment should not appear twice in one incomplete block 
until it has occurred once in all incomplete blocks. 


Other incomplete block designs (Federer, [1955], chapters XI and
XIII; Kempthorne, [1952], chapters 22, 25, 26) may be used to set
up additional augmented incomplete block designs. The procedure
is as described above for augmented triple and balanced lattice designs.


ANALYSES 


The generalized form of the analysis of variance for an augmented
design with one-way elimination of heterogeneity is presented in Table 1;
the algebraic developments and the notational definitions for this table
are presented in the section entitled Appendix. (This development
logically appears in this section, but was relegated to an appendix
because of algebraic complexities.) The intrablock treatment means
and variances and the interblock treatment means and variances are
presented in matrix form. Likewise, the expected values of the mean
squares $E_e$, $E_b$, and $E_b'$ from Table 1 are presented in general form.
Utilizing these results, the analysis of variance, adjusted treatment
means, and variances for mean differences are presented for the
augmented randomized complete block design and for the resolvable
augmented balanced lattice design in the following two sections.
Analyses for other incomplete block designs may be obtained from the
general results given in the Appendix.


TABLE 1
ANALYSIS OF VARIANCE FOR EXPERIMENTAL DESIGNS
WITH ONE-WAY ELIMINATION OF HETEROGENEITY

Source of variation                          | d.f.                   | Sum of squares                                          | Mean squares
Correction for mean = CF                     | 1                      | $Y_{...}^2/n_{...}$                                     | -
Among incomplete blocks (ignoring            | $b - 1$                | $\sum_j\sum_h Y_{.jh}^2/n_{.jh} - CF$                   | -
  treatments and complete blocks)            |                        |                                                         |
Among treatments (eliminating complete       | $v - 1$                | $\sum_i \hat\tau_i Q_{i..}$                             | -
  and incomplete block effects)              |                        |                                                         |
Intrablock error                             | $n_{...} - b - v + 1$  | subtraction                                             | $E_e$
Complete blocks (eliminating treatments,     | $r - 1$                | $\sum_j \rho_j^* Q_{.j.}$                               | -
  ignoring incomplete blocks)                |                        |                                                         |
Incomplete blocks (eliminating               | $b - r$                | subtraction                                             | $E_b$
  treatments and complete blocks)            |                        |                                                         |
Among complete and incomplete                | $b - 1$                | $\sum_j\sum_h (\hat\rho_j + \hat\beta_{jh}) Q_{.jh}$    | $E_b'$
  blocks (eliminating treatments)            |                        |                                                         |


AUGMENTED RANDOMIZED COMPLETE BLOCK DESIGN 


For the augmented randomized complete block design, it is simpler
algebraically to look at the normal equations for the effects from the
intrablock analysis rather than to consider the $v$ formulae given by (3).
The discussion here refers to new treatments appearing only once in the
experiment. If the new treatments appear more than once, it may be
simpler to obtain solutions from equation (4). The normal equations are:


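Mean:

\[
n_{..}\hat\mu + r\sum_{i=1}^{v_s}\hat\tau_i + \sum_{i=v_s+1}^{v}\hat\tau_i + \sum_{j=1}^{r} N_j\hat\rho_j = Y_{..}
\]

Blocks:

\[
N_j(\hat\mu + \hat\rho_j) + \sum_{i=1}^{v} n_{ij}\hat\tau_i = Y_{.j}
\]

Standard treatments (in $r$ blocks):

\[
r(\hat\mu + \hat\tau_i) + \sum_{j=1}^{r}\hat\rho_j = Y_{i.}
\]

New treatments (in one of the $r$ blocks):

\[
\hat\mu + \hat\tau_i + \hat\rho_j = Y_{ij},
\]

where³

\[
\sum_{i=1}^{v}\hat\tau_i = \sum_{j=1}^{r}\hat\rho_j = 0.
\]

Solution of these equations for effects results in the following (Federer,
[1956a] and [1956c]):

\[
\hat\mu + \hat\tau_i = Y_{i.}/r = \bar y_{i.} \quad\text{(standards)},
\]

\[
\hat\rho_j = \frac{1}{v_s}\Big(Y_{.j} - \sum_{i=1}^{v_s}\bar y_{i.} - \sum_{k=1}^{n_{nj}} Y_{kj}\Big)
\]

($N_j - v_s = n_{nj}$ = number of new treatments in block $j$),

\[
\hat\mu + \hat\tau_i = Y_{ij} - \hat\rho_j \quad\text{(new treatments)}.
\]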
Thus, the new treatment means need to be adjusted for the block in
which they appear.

The various estimated variances between treatment means are:

Two standards:

\[ V(\bar y_{i.} - \bar y_{k.}) = 2E_e/r \quad (i \ne k). \]

Two new treatments in the same block:

\[ V(Y_{ij} - Y_{kj}) = 2E_e \quad (i \ne k). \]

Two new treatments in different blocks:

\[ V(Y_{ij} - \hat\rho_j - Y_{kf} + \hat\rho_f) = 2E_e(1 + 1/v_s) \quad (i \ne k,\ j \ne f). \]

A standard and a new treatment:

\[ V(\bar y_{i.} - Y_{kj} + \hat\rho_j) = E_e\Big(1 + \frac{1}{r} + \frac{1}{v_s} - \frac{1}{rv_s}\Big) \quad (i \ne k), \]

where $E_e$ is the error mean square obtained from an analysis of variance
on the standards alone.

³A somewhat simpler solution could have been obtained by taking
$\sum_{i=1}^{v_s}\hat\tau_i = 0$ rather than $\sum_{i=1}^{v}\hat\tau_i = 0$.

The following numerical example was constructed for ease of
computation ($r = 3$, $v_s = 3$, $v_n = 2$):


                        Replicate number
Treatment               1      2      3      Standard totals
Standard 1              9      6     12      27
Standard 2              5      6     10      21
Standard 3              7      6     11      24
New treatment 4        13
New treatment 5               10
Total for standards    21     18     33      72


Applying the formulae obtained above we find that $\hat\mu = 10$,
$\hat\tau_1 = -1$, $\hat\tau_2 = -3$, $\hat\tau_3 = -2$, $\hat\tau_4 = 4$,
$\hat\tau_5 = 2$, $\hat\rho_1 = -1$, $\hat\rho_2 = -2$, $\hat\rho_3 = 3$.
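The estimates quoted for this example can be reproduced with a few lines of exact arithmetic. The sketch below is an illustration only: it takes the yields 9, 6, 12; 5, 6, 10; 7, 6, 11 for the three standards in replicates 1 to 3 (the last yield, 11, is inferred from the standards total of 72), places the new yields 13 and 10 in replicates 1 and 2, and uses the restriction that all five treatment effects sum to zero. The function name and data layout are hypothetical.

```python
from fractions import Fraction as F

standards = {1: [9, 6, 12], 2: [5, 6, 10], 3: [7, 6, 11]}  # yields by block
new = {4: (0, 13), 5: (1, 10)}       # treatment: (block index, yield)

def arcbd_effects(standards, new):
    r = len(next(iter(standards.values())))
    vs = len(standards)
    # block effects come from the standards alone
    block_means = [F(sum(y[j] for y in standards.values()), vs) for j in range(r)]
    grand = sum(block_means) / r
    rho = [m - grand for m in block_means]
    # standard means are unadjusted; new yields are adjusted for their block
    means = {i: F(sum(y), r) for i, y in standards.items()}
    means.update({i: y - rho[j] for i, (j, y) in new.items()})
    mu = sum(means.values()) / len(means)
    tau = {i: m - mu for i, m in means.items()}
    return mu, rho, tau

mu, rho, tau = arcbd_effects(standards, new)
print(mu, rho, tau)
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared directly with the integers in the text.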


In the analysis of variance table we obtain:


Source of variation                   df      ss         ms

Total (corrected for mean)            10    76.5454      -
Blocks (ignoring treatment)            2    27.5454      -
Treatments (eliminating blocks)        4    45          11.25
Error (elim. tr. and bl.)⁴             4     4           1
Block (eliminating treatment)⁴         2    42          21
Treatment (ignoring blocks)            4    30.5454      -

The error sum of squares equals 4 as it should since
$\epsilon_{11} = +1$, $\epsilon_{21} = -1$, $\epsilon_{12} = -1$, and
$\epsilon_{22} = +1$ was used in constructing the example and
$1^2 + (-1)^2 + (-1)^2 + 1^2 = 4$. Also, from the formulae for variances,

V(difference between two standards) = 2(1)/3 = 2/3,
V(difference between two new treatments in different blocks) = 2(1)(1 + 1/3) = 8/3,
V(difference between a standard and a new treatment) = (1)(1 + 1/3 + 1/3 - 1/9) = 14/9.

⁴The error (eliminating treatment and block) and the block (eliminating treatment) sums of
squares may be obtained from the analysis of variance on the yields for the standards.
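The variance formulae can be evaluated exactly for any error mean square. The following sketch assumes the four formulae $2E_e/r$, $2E_e$, $2E_e(1 + 1/v_s)$ and $E_e(1 + 1/r + 1/v_s - 1/rv_s)$; the function name and dictionary keys are hypothetical labels.

```python
from fractions import Fraction as F

def arcbd_difference_variances(Ee, r, vs):
    """Estimated variances of differences between treatment means."""
    Ee = F(Ee)
    return {
        "two standards": 2 * Ee / r,
        "two new, same block": 2 * Ee,
        "two new, different blocks": 2 * Ee * (1 + F(1, vs)),
        "standard vs new": Ee * (1 + F(1, r) + F(1, vs) - F(1, r * vs)),
    }

for name, value in arcbd_difference_variances(1, 3, 3).items():
    print(name, value)
```

With $E_e = 1$ and $r = v_s = 3$ the values are 2/3, 2, 8/3 and 14/9.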
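Likewise, it is possible to partition the treatment (eliminating block)
sum of squares with $v_s + v_n - 1$ degrees of freedom into the following
orthogonal contrasts:⁵

Among standards ($v_s - 1$ d.f.):

\[
\sum_{i=1}^{v_s} Y_{i.}^2/r - \Big(\sum_{i=1}^{v_s} Y_{i.}\Big)^2\Big/ rv_s = (27^2 + 21^2 + 24^2)/3 - 72^2/9 = 6.
\]

Among new treatments within a block ($\sum_j n_{nj} - r$ d.f.):

\[
\sum_{j=1}^{r}\Big\{\sum_{k=1}^{n_{nj}} Y_{kj}^2 - \Big(\sum_{k=1}^{n_{nj}} Y_{kj}\Big)^2\Big/ n_{nj}\Big\} = 0.
\]

Standards vs. new treatments in block $j$ ($r$ d.f.):

\[
\sum_{j=1}^{r}\frac{\Big(n_{nj}\sum_{i=1}^{v_s} Y_{ij} - v_s\sum_{k=1}^{n_{nj}} Y_{kj}\Big)^2}{v_s n_{nj}(v_s + n_{nj})}
= \frac{(21 - 3(13))^2}{3(1)(3 + 1)} + \frac{(18 - 3(10))^2}{3(1)(3 + 1)} + 0 = 39.
\]

Thus, 6 + 0 + 39 = 45 = treatment (eliminating block) sum of squares.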
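The partition can be verified numerically. This sketch assumes the same yields as the numerical example (standards 9, 5, 7; 6, 6, 6; 12, 10, 11 in replicates 1 to 3, with new yields 13 and 10 in replicates 1 and 2) and writes each standards-versus-new contrast as a squared difference of totals divided by its divisor. The function name is hypothetical.

```python
from fractions import Fraction as F

def partition_treatment_ss(std_by_block, new_by_block):
    """Return (among standards, among new within blocks, standards vs new)."""
    r, vs = len(std_by_block), len(std_by_block[0])
    totals = [sum(b[i] for b in std_by_block) for i in range(vs)]
    among_std = F(sum(t * t for t in totals), r) - F(sum(totals) ** 2, r * vs)
    within_new, std_vs_new = F(0), F(0)
    for stds, news in zip(std_by_block, new_by_block):
        n = len(news)
        if n == 0:
            continue                     # block without new treatments
        S, T = sum(stds), sum(news)
        within_new += sum(y * y for y in news) - F(T * T, n)
        std_vs_new += F((n * S - vs * T) ** 2, vs * n * (vs + n))
    return among_std, within_new, std_vs_new

parts = partition_treatment_ss([[9, 5, 7], [6, 6, 6], [12, 10, 11]], [[13], [10], []])
print(parts, sum(parts))
```

The three parts add to the treatment (eliminating blocks) sum of squares of the analysis of variance table.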


AUGMENTED BALANCED LATTICE DESIGN 


The balanced lattice design, or its equivalent, the balanced
incomplete block design, is described in various places (e.g., Federer,
[1955], sections XI-3.3, XI-4, and XIII-2.1; Kempthorne, [1952],
Chapters 22, 23, and 26). By including additional treatments in the
incomplete blocks, an augmented balanced lattice design (ABLD) is
formed. The simplest ABLD is the one formed by including new
treatments only once in the experiment. If some of the new treatments
appear in more than one incomplete block, the computations become
more complex, but the general results given in the Appendix apply.

For the ABLD, it appears that the restriction
$\sum_{i=1}^{v_s}\hat\tau_i = 0$ results in a simpler solution than the
restriction $\sum_{i=1}^{v_s}\hat\tau_i + \sum_{i=v_s+1}^{v}\hat\tau_i = 0$
(the first $v_s$ treatments are standards and the remaining $v_n$
treatments are new ones). Using $\sum_{i=1}^{v_s}\hat\tau_i = 0$ as the
restriction, it must be remembered that the differences between treatment
means are unbiased but that the treatment means themselves are biased
considering all $v$ treatments as fixed effects with

\[
\sum_{i=1}^{v}(\mu_{i..} - \mu) = 0,
\]

where $\mu_{i..}$ equals true treatment mean and
$\mu = \sum_{i=1}^{v}\mu_{i..}/v$. In most situations, interest lies in
differences between means rather than in the means themselves.

⁵Suggested by a referee.


Intrablock analysis: 


The intrablock analysis of the ABLD with the standards in a
balanced incomplete block design for $v_s = k^2$, $r = k + 1$, and
$b = k(k + 1)$ incomplete blocks of size $k$, is relatively simple
computationally for the standards in $k + 1$ replicates and the new
treatments included once. Using the $v$ equations obtained from formula (3)
add $1/k$ times the sum of the normal equations for the new treatments
which appear in an incomplete block with the $i$th standard, to the
normal equation for the $i$th standard. Performing the same operations
on the normal equations for all standards results in a set of equations
involving only the effects and the yields associated with the standards.
The resulting equation for the $i$th standard treatment after using the
equation $\sum_{i=1}^{v_s}\hat\tau_i = 0$, is

\[
k\hat\tau_i = Y_{i..} - \sum_{j=1}^{k+1}\sum_{h=1}^{k} n_{ijh}\,\bar y_{s.jh},
\]

where $\bar y_{s.jh}$ equals $jh$th incomplete block mean on standards only and
$\lambda = 1$ = number of times any two standards appear together in an
incomplete block in this particular ABLD. With estimated treatment
effects for the standards it is possible to solve for the remaining
$v - v_s = v_n$ effects for the new treatments simply by substituting the
$\hat\tau_i$ for the standards and solving the $n_{.jh} - k$ equations in the
$jh$th block.

Since $\hat\rho_j = \bar y_{s.j.} - \bar y_{s...}$ (where $\bar y_{s.j.}$ = $j$th
complete block mean and $\bar y_{s...}$ = overall mean obtained on the yields
of the standard treatments alone), the $\hat\beta_{jh}$ may be obtained from
formula (8) reduced as follows:

\[
\frac{k^2}{k + 1}\,\hat\beta_{jh} = Y_{s.jh} - k\hat\rho_j - \frac{1}{k + 1}\sum_{i=1}^{v_s} n_{ijh} Y_{i..} = Q'_{.jh}.
\]

With the additional equation $\sum_{h=1}^{k}\hat\beta_{jh} = 0$, we find that
the solutions are:

\[
\hat\beta_{jh} = (k + 1)Q'_{.jh}/k^2,
\]

which is the usual solution for a resolvable balanced lattice design with
the parameters $v = k^2$, $k$ = block size, $r = k + 1$, $b = k(k + 1)$ and
$\lambda = 1$. The above solutions are used in the computation of the sums
of squares in the analysis of variance.

The treatment mean adjusted for complete block and for incomplete
block effects is $\hat\mu + \hat\tau_i$ and the variance of a difference
between two adjusted means in this ABLD is one of the following:


V (difference between intrablock means of two standards)
    $= V(\hat\tau_i - \hat\tau_g) = 2\sigma_e^2/k$,
V (difference between a standard and new treatment occurring
  in the same incomplete block)
    $= \sigma_e^2\{1/k + (k^2 + k + 1)/k^2 - 2/k^2\} = \sigma_e^2(k^2 + 2k - 1)/k^2$,
V (difference between a standard and new treatment not occurring
  in the same incomplete block)
    $= \sigma_e^2\{1/k + (k^2 + k + 1)/k^2\} = \sigma_e^2(k + 1)^2/k^2$,
V (difference between two new treatments in the same incomplete
  block)
    $= \sigma_e^2\{2(k^2 + k + 1)/k^2 - 2(k + 1)/k^2\} = 2\sigma_e^2$,
V (difference between two new treatments in the same replicate
  but not in the same incomplete block)
    $= 2\sigma_e^2(k^2 + k + 1)/k^2$,
V (difference between two new treatments not in the same
  replicate)
    $= \sigma_e^2\{2(k^2 + k + 1)/k^2 - 2/k^3\} = 2\sigma_e^2(k^3 + k^2 + k - 1)/k^3$.


The $n^{ig}$ in formulae (4) to (6) may be obtained from the above variances.
The intrablock mean square $E_e$ is an estimate of $\sigma_e^2$. Hence, the
estimated variances are obtained simply by substituting $E_e$ for
$\sigma_e^2$ in the above six variances.
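The six variances can be tabulated for a given block size. The sketch below assumes the formulae as read from this section, with $(k^2 + k + 1)/k^2$ as the variance factor of a new treatment effect and $1/k$ that of a standard; the function name and the dictionary keys are hypothetical labels.

```python
from fractions import Fraction as F

def abld_difference_variances(Ee, k):
    """Estimated variances of differences between adjusted means in an ABLD."""
    Ee, k = F(Ee), F(k)
    d = (k**2 + k + 1) / k**2        # variance factor of a new treatment effect
    return {
        "two standards": 2 * Ee / k,
        "standard vs new, same block": Ee * (1 / k + d - 2 / k**2),
        "standard vs new, different blocks": Ee * (1 / k + d),
        "two new, same block": Ee * (2 * d - 2 * (k + 1) / k**2),
        "two new, same replicate, different blocks": 2 * Ee * d,
        "two new, different replicates": Ee * (2 * d - 2 / k**3),
    }

for name, value in abld_difference_variances(F(8, 3), 2).items():
    print(name, value)
```

For $k = 2$ and $E_e = 8/3$ this gives 8/3, 14/3, 6, 16/3, 28/3 and 26/3.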

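The following numerical example was selected for ease of
computation. The non-randomized layout is presented below along with the
computations:


                 Replicate I                Replicate II               Replicate III
Block            1            2             1            2             1            2
                 Y111 = 4     Y312 = 14     Y121 = 11    Y222 = 8      Y131 = 7     Y232 = 10
                 Y211 = 5     Y412 = 13     Y421 = 10    Y322 = 15     Y331 = 11    Y432 = 12
                 Y511 = 7                                Y722 = 9
                 Y611 = 9
Totals           25           27            21           32            18           22
  (replicate)    52                         53                         40
Totals on
  standards      9            27            21           23            18           22
  (replicate)    36                         44                         40


Treatment totals: $Y_{1..} = 22$, $Y_{2..} = 23$, $Y_{3..} = 40$, $Y_{4..} = 35$,
$Y_{5..} = 7$, $Y_{6..} = 9$, $Y_{7..} = 9$. Grand total = $Y_{...} = 145$; grand
total on standards = $Y_{s...} = 120$.

$Q_{1..} = -45/12$, $Q_{2..} = -59/12$, $Q_{3..} = 82/12$, $Q_{4..} = 0$,
$Q_{5..} = 9/12$, $Q_{6..} = 33/12$, $Q_{7..} = -20/12$;
$Q'_{.11} = -4$, $Q'_{.12} = 4$, $Q'_{.21} = 0$, $Q'_{.22} = 0$,
$Q'_{.31} = -8/3$, $Q'_{.32} = 8/3$.

$\hat\mu = 120/12 = 10$, $\hat\rho_1 = (36/4) - 10 = -1$,
$\hat\rho_2 = (44/4) - 10 = 1$, $\hat\rho_3 = (40/4) - 10 = 0$.

\[
\begin{pmatrix}\hat\tau_1\\ \hat\tau_2\\ \hat\tau_3\\ \hat\tau_4\\ \hat\tau_5\\ \hat\tau_6\\ \hat\tau_7\end{pmatrix}
=
\begin{pmatrix}
1/2 & 0 & 0 & 0 & 1/4 & 1/4 & 0\\
0 & 1/2 & 0 & 0 & 1/4 & 1/4 & 1/4\\
0 & 0 & 1/2 & 0 & 0 & 0 & 1/4\\
0 & 0 & 0 & 1/2 & 0 & 0 & 0\\
1/4 & 1/4 & 0 & 0 & 7/4 & 3/4 & 1/8\\
1/4 & 1/4 & 0 & 0 & 3/4 & 7/4 & 1/8\\
0 & 1/4 & 1/4 & 0 & 1/8 & 1/8 & 7/4
\end{pmatrix}
\begin{pmatrix}-45/12\\ -59/12\\ 82/12\\ 0\\ 9/12\\ 33/12\\ -20/12\end{pmatrix}
=
\begin{pmatrix}-1\\ -2\\ 3\\ 0\\ 1\\ 3\\ -2\end{pmatrix},
\]

\[
\begin{pmatrix}\hat\beta_{11}\\ \hat\beta_{12}\\ \hat\beta_{21}\\ \hat\beta_{22}\\ \hat\beta_{31}\\ \hat\beta_{32}\end{pmatrix}
= \frac{3}{4}
\begin{pmatrix}-4\\ 4\\ 0\\ 0\\ -8/3\\ 8/3\end{pmatrix}
=
\begin{pmatrix}-3\\ 3\\ 0\\ 0\\ -2\\ 2\end{pmatrix}.
\]

Using the above, the sums of squares in the analysis of variance
table for the standard treatments are computed in the usual manner
(see Federer, [1955], page 342) and are: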
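The intrablock treatment effects of this example can be recomputed by forming the adjusted totals $Q_{i..}$ and solving the reduced normal equations with the restriction that the effects of the four standards sum to zero. The sketch below is an illustration only: the block contents are those implied by the treatment and block totals of the example (standards 1 to 4, new treatments 5 to 7), the restriction is imposed by adding it to each standard's equation, and the small exact solver is a hypothetical helper, not a method from the paper.

```python
from fractions import Fraction as F

blocks = [{1: 4, 2: 5, 5: 7, 6: 9}, {3: 14, 4: 13},      # replicate I
          {1: 11, 4: 10}, {2: 8, 3: 15, 7: 9},           # replicate II
          {1: 7, 3: 11}, {2: 10, 4: 12}]                 # replicate III

def solve(A, b):
    """Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]

def intrablock_effects(blocks, standards):
    trts = sorted({t for b in blocks for t in b})
    idx = {t: i for i, t in enumerate(trts)}
    v = len(trts)
    C = [[F(0)] * v for _ in range(v)]
    Q = [F(0)] * v
    for b in blocks:
        size, total = len(b), sum(b.values())
        for t, y in b.items():
            Q[idx[t]] += y - F(total, size)      # yield minus block mean
            C[idx[t]][idx[t]] += 1
            for g in b:
                C[idx[t]][idx[g]] -= F(1, size)
    for t in standards:                           # restriction on standards
        for g in standards:
            C[idx[t]][idx[g]] += 1
    return dict(zip(trts, Q)), dict(zip(trts, solve(C, Q)))

Q, tau = intrablock_effects(blocks, [1, 2, 3, 4])
print(Q[1], tau)
```

Because the adjusted totals sum to zero, adding the restriction to the standards' equations makes the system non-singular without changing its solution.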


Source of variation             df      ss         ms

Total                           12     1330        -
CFM                              1     1200        -
Replicates                       2        8        -
Standards (ign. blocks)          3    238/3        -
Blocks (elim. treatment)         3    104/3     104/9 = $E_b$
Intrablock error                 3        8       8/3 = $E_e$

The analysis of variance on all treatments is:

Source of variation               df      ss           ms

Total                             15     1541          -
CFM                                1   4205/3          -
Replicates (ign. tr. and bl.)      2   54/5 = 10.8     -
Blocks (elim. reps; ign. tr.)      3   4447/60         -
Treatments (elim. bl. and reps)    6   557/12       557/72
Intrablock error                   3        8        8/3 = $E_e$
Blocks (elim. tr. and reps)        3   104/3       104/9 = $E_b$


The various variances of a mean difference between two adjusted
treatment means are:


Variance of difference between two standards

    $2E_e/k = 8/3$.

Variance of difference between standard and new treatment in same
incomplete block

    $E_e(k^2 + 2k - 1)/k^2 = (8/3)(7/4) = 14/3$.

Variance of difference between standard and new treatment not in same
incomplete block

    $E_e(k + 1)^2/k^2 = (8/3)(3/2)^2 = 6$.

Variance of difference between two new treatments in same incomplete
block

    $2E_e = 16/3$.

Variance of difference between two new treatments in same replicate but
different incomplete blocks

    Not applicable for this example.

Variance of difference between two new treatments in different replicates

    $2E_e(k^3 + k^2 + k - 1)/k^3 = 2(8/3)(8 + 4 + 2 - 1)/8 = 26/3$.
Interblock analysis:

The amount of intrablock information is $w = 1/E_e = 3/8$. In order
to obtain the amount of interblock information, we first need to obtain
the expectation of $E_b$ from formula (13). After substituting in the
various values, a rather surprising result is obtained in that $E_b$ in the
ABLD with new treatments occurring only once has the same expectation,
$\sigma_e^2 + k^2\sigma_\beta^2/(k + 1)$, as for the standard balanced lattice
design. Therefore, $w' = k/[(k + 1)E_b - E_e] = 2/[3(104/9) - 8/3] = 1/16$.
With these weights, we now proceed to the computation of the $n^{*ig}$
and $\tau_i^*$ from formulae (21) and (23).⁶ Thus $\tau_1^* = -1.440$,
$\tau_2^* = -2.132$, $\tau_3^* = 3.132$, $\tau_4^* = .440$, $\tau_5^* = .143$,
$\tau_6^* = 2.143$, $\tau_7^* = -2.000$.
The variances of a difference between any two $\tau_i^*$ may be obtained from
formula (24). Thus, the variance of


998 1297 ( =5) 
= + — — — = 2.446. 
— = 819 * 1071 819 446 
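The two weights used in the interblock analysis follow directly from the mean squares of the analysis of variance on the standards. The sketch below assumes $w = 1/E_e$ and $w' = k/[(k + 1)E_b - E_e]$; with $E_e = 8/3$ the first weight evaluates to 3/8. The function name is hypothetical.

```python
from fractions import Fraction as F

def lattice_weights(Ee, Eb, k):
    """Return (w, w') for a balanced lattice with block size k."""
    w = 1 / F(Ee)                              # intrablock information
    w_prime = k / ((k + 1) * F(Eb) - F(Ee))    # interblock information
    return w, w_prime

print(lattice_weights(F(8, 3), F(104, 9), 2))
```

If $E_b$ were smaller than $E_e/(k + 1)$ the second weight would be negative, a case the sketch does not handle.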


SUMMARY 


An augmented experimental design is any standard experimental 
design to which additional treatments (new treatments) have been 
added to those (standard treatments) appearing in the standard ex- 
perimental design. The additional treatments require enlargement of 
the complete block, the incomplete block, row, column, etc. The 
groupings in an augmented design may be of unequal size. The con- 
struction and randomization procedures, and the general method of 
analysis have been given for all augmented experimental designs with 
one-way elimination of heterogeneity from the experimental area. 
The general results are illustrated algebraically and arithmetically with 
two examples, an augmented randomized complete block design and an 
augmented balanced lattice design. Analyses with and without re- 
covery of interblock information are considered. Some discussion of 
unequal incomplete block sizes is given.


⁶The suggestions leading up to equation (26) were not followed in the computations; the above
yields approximately the same results for this example. From equation (26) w′ = 9(53)/(53(104) -
23(24)) = .096.


APPENDIX

The generalized analysis of variance is developed below for all designs
with one-way elimination of heterogeneity. Such designs as the
randomized complete block and incomplete block designs which are resolvable
(the $v$ treatments occur together in a complete block) and which are
non-resolvable ($v$ treatments occur in $b$ incomplete blocks of size $k$,
for $k < v$), are considered.

To be completely general, let the $i$th one of the $v$ treatments be
replicated $r_i$ times in the $b$ incomplete blocks of size $n_{.jh}$. Let the
yield of the $ijh$th observation be expressed by

\[
Y_{ijh} = n_{ijh}(\mu + \tau_i + \rho_j + \beta_{jh} + \epsilon_{ijh}), \tag{1}
\]

where $i = 1, \ldots, v$ = number of treatments; $j = 1, \ldots, r$ = number
of complete blocks; $h = 1, \ldots, k_j$ = number of incomplete blocks
in the $j$th complete block; $n_{ijh} = 1$ if $i$th treatment occurs in the $h$th
incomplete block of the $j$th complete block and zero otherwise;⁷ $n_{.jh}$ =
number of treatments in $h$th incomplete block of the $j$th complete
block; $n_{.j.} = v_j$ = number of treatments in the $j$th complete block;
$n_{i..} = r_i$ = number of replicates of the $i$th treatment;

\[
n_{...} = \sum_{j=1}^{r} n_{.j.} = \sum_{j=1}^{r}\sum_{h=1}^{k_j} n_{.jh}; \qquad
\mu_{ijh} = \mu + \tau_i + \rho_j + \beta_{jh};
\]

\[
\mu = \text{a general mean effect} = \sum_{i=1}^{v}\mu_{i..}/v,
\]

where $k_j$ = number of incomplete blocks in the $j$th complete block;
$\tau_i = \mu_{i..} - \mu$ = a treatment effect; $\rho_j = \mu_{.j.} - \mu$ = a
complete block effect; $\beta_{jh} = \mu_{.jh} - \mu_{.j.}$ = an incomplete
block effect; and $\epsilon_{ijh}$ are random independent effects with mean
zero and common variance $\sigma_e^2$. These definitions imply
$\sum_i \tau_i = \sum_j \rho_j = \sum_h \beta_{jh} = 0$. Other
definitions for the effects are permissible.


Intrablock Analysis 


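The least squares estimates of effects for the above linear model
(intrablock analysis) are obtained by minimizing the residual sum of
squares:

\[
\sum_{i=1}^{v}\sum_{j=1}^{r}\sum_{h=1}^{k_j} n_{ijh}(Y_{ijh} - \hat\mu - \hat\tau_i - \hat\rho_j - \hat\beta_{jh})^2.
\]

⁷To be completely general, $n_{ijh}$ could be the number of times treatment $i$ occurs in block $jh$
instead of 1 or zero, but this was not done here.

Equating to zero each of the partial derivatives of the above residual
sum of squares with respect to $\hat\mu$, $\hat\tau_i$, $\hat\rho_j$, and
$\hat\beta_{jh}$ results in the following normal equations:

\[
n_{...}\hat\mu + \sum_i n_{i..}\hat\tau_i + \sum_j n_{.j.}\hat\rho_j + \sum_j\sum_h n_{.jh}\hat\beta_{jh} = Y_{...} = \text{grand total},
\]

\[
n_{i..}(\hat\mu + \hat\tau_i) + \sum_j n_{ij.}\hat\rho_j + \sum_j\sum_h n_{ijh}\hat\beta_{jh} = Y_{i..} = i\text{th treatment total},
\]

\[
n_{.j.}(\hat\mu + \hat\rho_j) + \sum_i n_{ij.}\hat\tau_i + \sum_{h=1}^{k_j} n_{.jh}\hat\beta_{jh} = Y_{.j.} = j\text{th complete block total},
\]

\[
n_{.jh}(\hat\mu + \hat\rho_j + \hat\beta_{jh}) + \sum_i n_{ijh}\hat\tau_i = Y_{.jh} = jh\text{th incomplete block total}. \tag{2}
\]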


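In the $\tau_i$ equation, substitute for $\hat\mu + \hat\rho_j + \hat\beta_{jh}$
from the $\beta_{jh}$ equations to obtain:

\[
n_{i..}\hat\tau_i - \sum_j\sum_h \frac{n_{ijh}}{n_{.jh}}\sum_g n_{gjh}\hat\tau_g
= Y_{i..} - \sum_j\sum_h \frac{n_{ijh}Y_{.jh}}{n_{.jh}} = Q_{i..}. \tag{3}
\]

When $n_{i..} = r$ and $n_{.jh} = k$ the above equation becomes:

\[
r\hat\tau_i - \frac{1}{k}\sum_g \lambda_{gi}\hat\tau_g = Q_{i..},
\]

where $\sum_j\sum_h n_{gjh}n_{ijh} = \lambda_{gi}$ = number of times the $g$th
treatment occurs with the $i$th treatment in all incomplete blocks. For
balanced lattices $\lambda_{gi} = \lambda$, a constant for $i \ne g$, and the
solution for $\hat\tau_i$ is (for $\sum_i \hat\tau_i$ = zero):

\[
\hat\tau_i = kQ_{i..}/(kr - r + \lambda) = kQ_{i..}(v - 1)/rv(k - 1)
\]

as given by Federer ([1955], formula XIII-2, where $Q_{.ij} = Q_{i..}$).
For the general case we can add $d_g\sum_i \hat\tau_i = 0$, where $d_g \ne 0$
for at least one value of $g$, to each of the $v$ equations in the
$\hat\tau_i$; adding $d_g\sum_i \hat\tau_i$ to each of the equations in (3)
results in: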

We have $v$ equations and $v$ unknowns and the problem is to solve the
equations. In matrix notation the $v + 1$ equations from (3) plus the
equation $\sum_i \hat\tau_i = 0$ is:


N jh Nir Nin 

Nish 


eee oe 


jh 
N jh 


l 1 1 0 | 
PQs... = Y,.- > 


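Again in matrix notation the solution for the $\hat\tau_i$ are obtained
from the inverse matrix, where $n^{ig}$ are the elements of the inverse
matrix. The solution for $\hat\tau_i$ is

\[
\hat\tau_i = \sum_{g=1}^{v} n^{ig}Q_{g..}, \tag{4}
\]

the variance of $\hat\tau_i$ is

\[
V(\hat\tau_i) = n^{ii}\sigma_e^2, \tag{5}
\]

and the variance of a difference between $\hat\tau_i$ and $\hat\tau_g$ is
(Nair, [1941]):

\[
V(\hat\tau_i - \hat\tau_g) = (n^{ii} + n^{gg} - n^{ig} - n^{gi})\sigma_e^2. \tag{6}
\]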


In the p; equations, substitute for @ + 7, from 7, equation, thus for j = f: 
Nis. Nis. E 4 A 


= Ni Gi.. = (7) 


The solutions for the $\hat\rho_j$ and $\hat\beta_{jh}$ must be obtained jointly since they are
not orthogonal and since three sets of unknowns are present. Equations
involving the $\hat\tau_i$ only are possible here because the incomplete blocks
in the complete blocks can be considered as $b$ incomplete blocks. For
the solution we proceed to the $\beta_{jh}$ equations and substitute for $\hat\mu + \hat\tau_i$
to obtain (for $j = f$ and $h = e$):


+ Br) + 8) 


(8)


From these $b$ equations solutions for $\hat\rho_j + \hat\beta_{jh}$ are obtained. Summing
over $h$, solutions for the $\hat\rho_j$ are obtained since $\sum_h \hat\beta_{jh} = 0$. If $b$ is
less than $v$, then solve for $\hat\rho_j + \hat\beta_{jh}$; if not, solve for the $\hat\tau_i$. Of course,
as a check one could obtain solutions for both sets of effects and the
results must check by satisfying the normal equations. Also,


“UF HEL 


j=1 j=1 A=l jk 


v 


(9) 


The p* in Table 1 are obtained from the r equations 
pyn.s, — py = — = (10) 
plus the equation $\sum_{j=1}^{r} \rho_j^* = 0$. These equations are obtained from
the normal equations for $\mu$, $\tau_i$, and $\rho_j$ setting each $\beta_{jh} = 0$.
The expected value of FE, iso? . The expected value of 
r ky 


(where k’™ ‘* are the elements of the inverse matrix in the solution of 


the $\hat\rho_j + \hat\beta_{jh}$ and where $\rho_j$ and $\beta_{jh}$ are random independent effects with
mean zero and variances $\sigma_\rho^2$ and $\sigma_\beta^2$, respectively) is:


A=1 f=1 g=1 


= AD 3 —2n 


j=1 h= i=1 N;.. 
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j=1 g=1 


Na.. e=1 c=l 


k r 
ihfo NasgNaih Ni 


Na.. e=1 c=l 
o(b — 1) = Kyo, + K.o3 + (b — 


The expected value of >>;_, »*Q.;. is [where a’ are the elements of 
the inverse matrix obtained in the solution of the $\rho_j^*$ from formula (10)]:


i=1 


i=1 t=1 


r v 
= a'i(n’,. — 2n.;. + > na) 


Nasr -Naj- 


d=1 Nae 
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+ — 1) = + + (r — Ie. 


Henderson [1953] has obtained expected values for sums of squares 
from non-orthogonal classifications for different situations. 


Now, (9; + Bin)Q.in — 205-1 is the sum of squares 
for incomplete blocks within complete blocks eliminating treatment 
effects and has the expectation: 


— + (b . (13) 
By definition the coefficient of o? must be zero; hence, K, = Ks; ; no 
proof of the equality is given here (see Yates, [1938]). It should be 
possible to simplify the coefficient for oj since there is a relationship 
between the a’ and the k’’”’. Perhaps this should be done prior to 
programming for high speed computers. 

The treatment mean adjusted for incomplete and complete block
effects is $\hat\mu + \hat\tau_i$. Only intrablock information is utilized in obtaining
the adjusted means. The variances of adjusted means and of differences
between adjusted means are given above (formulae (5) and (6)).
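
The balanced-case solution quoted earlier in this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the design (three treatments in three blocks of two) and the yields are invented, the names `blocks`, `yields` and `reduced_lhs` are hypothetical, and the reduced equation is taken in the standard form $r\hat\tau_i - (1/k)\sum_g \lambda_{gi}\hat\tau_g = Q_{i\cdot\cdot}$ with $\lambda_{ii} = r$.

```python
from fractions import Fraction

# Hypothetical balanced design (not from the paper): v = 3 treatments,
# b = 3 incomplete blocks of size k = 2, every pair occurring together once.
blocks = [(0, 1), (0, 2), (1, 2)]        # treatments in each block
yields = [(10, 14), (11, 18), (13, 17)]  # made-up plot yields, same order

v, k = 3, 2
r = sum(blk.count(0) for blk in blocks)                   # replicates per treatment
lam = sum(1 for blk in blocks if 0 in blk and 1 in blk)   # concurrences of a pair

# Q_i = treatment total minus the block means of the blocks containing i.
Q = [Fraction(0)] * v
for blk, ys in zip(blocks, yields):
    block_mean = Fraction(sum(ys), k)
    for t, y in zip(blk, ys):
        Q[t] += y - block_mean

# The two equivalent forms of the balanced solution.
tau_first = [k * q / (k * r - r + lam) for q in Q]
tau_second = [k * q * (v - 1) / (r * v * (k - 1)) for q in Q]


def reduced_lhs(tau, i):
    """Left side r*tau_i - (1/k) * sum_g lambda_gi * tau_g, with lambda_ii = r."""
    conc = [sum(1 for blk in blocks if i in blk and g in blk) for g in range(v)]
    return r * tau[i] - Fraction(1, k) * sum(c * t for c, t in zip(conc, tau))


print(tau_first)
```

Exact fractions are used so that the two forms of the solution, and the reduced equations themselves, can be compared without rounding error.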


Recovery of Interblock Information


The sum of squares to be minimized is 


j=1 A=1 


A=1 


(14) 


where the true weights are $w = 1/\sigma_e^2$ and $w'_{jh} = 1/(\sigma_e^2 + n_{\cdot jh}\sigma_\beta^2)$ for
$\beta_{jh}$ independently distributed with mean zero and variance $\sigma_\beta^2$ and
where the estimated weights are⁸ $\hat w = 1/\hat\sigma_e^2$ and $\hat w'_{jh} = 1/(\hat\sigma_e^2 + n_{\cdot jh}\hat\sigma_\beta^2)$.
Instead of using a different weight for each incomplete block, an average
coefficient is utilized and is given below. The estimated weight, $\hat w'_{jh}$,
should be utilized if there is sizeable disparity among the $n_{\cdot jh}$. The
resulting normal equations for $w'_{jh} = w'$ are:


⁸It is assumed that variable $n_{\cdot jh}$ has no effect on the intrablock variance (see Finney [1956]).
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(w + + + 


t=1 


+ w > Yn (w + w')Y... (15) 


Awl 


or 


i 
» 


, 


r ° ky 
+ (w + w’) + p;)) + w MoinBin (16) 
r kj 


pi + p;) + 


kj 


or 
+ p; + By) + = (18) 


Substituting for $\beta_{jh}$ from equation (18) in equations (15) and (17)
results in


win. 2.5.0; + =Y!Y..., (19) 
(a + p;) + = (20) 


Substituting for $\beta_{jh}$ from (18) and for $\mu$ and $\rho_j$ from (19) and (20) we
obtain:


A=1 


hel 


The above $v$ equations plus an additional equation, e.g., $\sum \tau_i^* = 0$,
result in unique solutions for the $\tau_i^*$; thus


where the original equations were in the form: 


Meo Ny 1 Z,.. 


and where the $c^{ig}$ are the elements of the inverse of the matrix of
coefficients for the $\tau_i^*$.
The variance of a difference between two adjusted means recovering
interblock information, say $\mu^* + \tau_i^*$ and $\mu^* + \tau_g^*$, is
$$V(\tau_i^* - \tau_g^*) = c^{ii} + c^{gg} - c^{ig} - c^{gi} \qquad (24)$$
and the variance of $\tau_i^*$ is:
$$V(\tau_i^*) = c^{ii} . \qquad (25)$$
It should be noted that $\sigma_e^2$ and $\sigma_\beta^2$ appear in the expected values
of the $Z_{i\cdot\cdot}$.
Now let us return to the calculation of the weight w’ = 1/(é% + 76). 
’ is determined from a nested classification in which there are no treat- 
ment effects (Federer, [1955], page 106). In the present notation the 
expected value of the mean square for among incomplete blocks within 
complete blocks in the absence of treatment effects is: 


and hence 


where b = k; 
The weight w’ for the jhth incomplete block will be (for individual block 
weights) 


wi, = + ind) - 


The expected value of the mean square, $E_b$, for incomplete blocks
within complete blocks eliminating treatment and complete block
effects is:


ot + — K,)/(b — 1) = + 
The average amount of interblock information is estimated as follows: 


| = m/[AE, — — (26) 
and the intrablock information is estimated as 
w= I1/E,. 


With these weights it is now possible to obtain solutions for the $\tau_i^*$
in (22).


Incomplete Blocks Not Arranged in Complete Blocks


The previous results may be used directly for an incomplete block
design for which the $b$ incomplete blocks are completely randomized.
To apply the formulae set p; = 0,7 = 0,k; = b,n,; = Oorl, andr = 0. 
The equations then become: 


b 
= + 7) + Y= + By) + Mints 


b 
— > 28s =Y,- rishi.» 


hol 
b n 
gh 
— D nati = Dd 
h=l 


and the expected value for the blocks eliminating treatment effects sum of
squares becomes:


b b 
(b + AD — } + > > min.) 
h 


NasNaj 
fel d=1 1%, 


hy f 


= (b — + (b — . 


Utilizing these results, the analysis goes through in much the same
manner as for the experimental design in which the incomplete blocks
are arranged in complete blocks.

If the information from complete blocks as well as from incomplete
blocks within complete blocks is utilized, the sum of squares is
$\sum_j \sum_h (\hat\rho_j + \hat\beta_{jh}) Q_{\cdot jh}$ and its expectation is obtained from equation
(11) which reduces to (27) above. If the differences among complete
blocks are random effects, perhaps the weight w’ should be computed 
from Ej instead of F, in the analysis of variance table. 
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PHENOTYPIC, GENETIC AND ENVIRONMENTAL 
CORRELATIONS 


S. R. SEARLE 
N. Z. Dairy Board, Wellington, New Zealand. 


INTRODUCTION 


A phenotypic correlation is the correlation between records of two 
traits on the same animal and is usually estimated by the product- 
moment correlation statistic. The genetic correlation, on the other 
hand, is the correlation between an animal’s genetic value for one 
trait and the same animal's genetic value for the other trait, estimators 
for which have been proposed by Hazel [1943]. Estimates of these 
correlations are widespread throughout the literature of animal breeding 
and in many instances the estimate of a phenotypic correlation is 
reported smaller in magnitude than that of the corresponding genetic 
correlation, e.g. with certain poultry records, in Lerner & Cruden 
[1948], sheep records in Morley [1951] and with certain dairy records in 
VanVleck [1960] and Searle [1961]. Such results may seem a little 
unexpected at first sight since phenotype includes genotype and one 
might anticipate the correlation between phenotypes to be larger than 
that between genotypes. When estimates have not followed this pattern 
the explanation is sometimes given that a phenotypic correlation less 
than a genetic correlation is the result of a negative environmental 
correlation in the records of the two traits. This paper investigates 
the relationship between these three correlations on the basis of a 
linear model, and demonstrates the situations in which this explanation 
is correct. Other comparisons are also made. 


LINEAR MODELS 


Suppose the records of two traits in an animal are $x$ and $X$.
Neglecting the general means, we will take each variable as being the
sum of a genetic term and an environmental (including error) term, i.e.

$$x = g + e \quad \text{and} \quad X = G + E. \qquad (1)$$

The genetic correlation $r$ is the correlation between $g$ and $G$; that
between $x$ and $X$ is the phenotypic correlation, $R$ say, and we will define
$r'$ as the environmental correlation, namely that between $e$ and $E$.
Thus we have

$$r = \mathrm{cov}(g, G)/\sigma_g \sigma_G ,$$

$$R = \mathrm{cov}(x, X)/\sigma_x \sigma_X ,$$

and

$$r' = \mathrm{cov}(e, E)/\sigma_e \sigma_E .$$

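The covariance between $g$ and $G$ is denoted by cov $(g, G)$ and their
variances are $\sigma_g^2$ and $\sigma_G^2$ respectively, with similar notation for the other
terms. The phenotypic correlation $R$ is that between $x$ and $X$ which
can be obtained directly from (1) as

$$R = \frac{\mathrm{cov}(g, G) + \mathrm{cov}(g, E) + \mathrm{cov}(G, e) + \mathrm{cov}(e, E)}{\sqrt{[\sigma_g^2 + 2\,\mathrm{cov}(g, e) + \sigma_e^2][\sigma_G^2 + 2\,\mathrm{cov}(G, E) + \sigma_E^2]}} .$$

Assuming all genetic-environment covariances (i.e. interactions) are
zero this becomes

$$R = \frac{\mathrm{cov}(g, G) + \mathrm{cov}(e, E)}{\sqrt{(\sigma_g^2 + \sigma_e^2)(\sigma_G^2 + \sigma_E^2)}} . \qquad (2)$$

This reduces to
$$R = r\sqrt{hH} + r'\sqrt{(1 - h)(1 - H)} \qquad (3)$$

where $h$ and $H$ are the heritabilities in the narrow sense, of the two
traits, defined as $\sigma_g^2/(\sigma_g^2 + \sigma_e^2)$ and $\sigma_G^2/(\sigma_G^2 + \sigma_E^2)$ respectively. This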
relationship between the three correlations, phenotypic, genetic and 
environmental, is derived in Lerner [1950] using the method of path 
coefficients. In both cases genetic-environmental interactions have 
been assumed zero, as is customary in discussions of this nature. 
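
As a numerical illustration of how equation (3) follows from (2), the sketch below computes $R$ both from variance and covariance components and from the correlations and heritabilities. The function names and the component values are hypothetical, chosen only to give $h = 0.25$ and $H = 0.5$; they are not data from the paper.

```python
import math


def phenotypic_from_components(var_g, var_e, var_G, var_E, cov_gG, cov_eE):
    """Equation (2): R with all genetic-environment covariances set to zero."""
    return (cov_gG + cov_eE) / math.sqrt((var_g + var_e) * (var_G + var_E))


def phenotypic_from_correlations(r, r_env, h, H):
    """Equation (3): R = r*sqrt(hH) + r'*sqrt((1-h)(1-H))."""
    return r * math.sqrt(h * H) + r_env * math.sqrt((1 - h) * (1 - H))


# Illustrative (made-up) components.
var_g, var_e = 2.0, 6.0      # first trait: h = 2/8 = 0.25
var_G, var_E = 3.0, 3.0      # second trait: H = 3/6 = 0.5
r, r_env = 0.6, 0.2          # genetic and environmental correlations

cov_gG = r * math.sqrt(var_g * var_G)
cov_eE = r_env * math.sqrt(var_e * var_E)
h = var_g / (var_g + var_e)
H = var_G / (var_G + var_E)

R_from_2 = phenotypic_from_components(var_g, var_e, var_G, var_E, cov_gG, cov_eE)
R_from_3 = phenotypic_from_correlations(r, r_env, h, H)
print(R_from_2, R_from_3)
```

The two values agree up to floating-point rounding, which is the content of the reduction from (2) to (3).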


ENVIRONMENTAL CORRELATION 


Equation (3) can be re-arranged to give the environmental correlation 
as 


$$r' = (R - r\sqrt{hH})/\sqrt{(1 - h)(1 - H)}. \qquad (4)$$


In terms of the model (1) this is the correlation between e and E, which 
include the random errors. Assuming the correlation between these is 
zero, r’ can be thought of as the environmental correlation. Phenotypic 
correlations are usually estimated directly whereas genetic correlations 
are derived from covariance analyses between relatives and contain
additive genetic variation only. Environmental correlations estimated 




476 BIOMETRICS, SEPTEMBER 1961 


from this formula may therefore contain genetic elements over and 
above additive genetic variation. 

Consideration of (4) shows that when $R$ and $r$ have the same sign
$r'$ will be negative if $r$ is greater than $R/\sqrt{hH}$. When $r$ and $R$ are of
opposite sign $r'$ has the same sign as $R$. Thus the phenotypic and
genetic correlations being of opposite sign (an infrequent occurrence
one would imagine) implies that the phenotypic and environmental
correlations have the same sign.

Genetic and phenotypic correlations of similar sign are the usual
situation, and in this case we see that the ratio of the phenotypic to
the genetic correlation has to be less than the geometric mean of the
heritabilities before a phenotypic correlation smaller than the
genetic correlation implies a negative environmental correlation.
Values of this geometric mean are given in Table 1.
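
The sign rule just stated can be tried out directly from equation (4). The following is a minimal sketch, assuming positive $R$ and $r$; the function names and the trial correlations are hypothetical and serve only to show the threshold $A = \sqrt{hH}$ at work.

```python
import math


def environmental_correlation(R, r, h, H):
    """Equation (4): r' = (R - r*sqrt(hH)) / sqrt((1-h)(1-H))."""
    return (R - r * math.sqrt(h * H)) / math.sqrt((1 - h) * (1 - H))


def geometric_mean_heritability(h, H):
    """A = sqrt(hH), the quantity the ratio R/r is compared with."""
    return math.sqrt(h * H)


h, H = 0.5, 0.3
A = geometric_mean_heritability(h, H)

# Ratio R/r below A: the environmental correlation is negative.
r_env_low = environmental_correlation(R=0.2, r=0.6, h=h, H=H)
# Ratio R/r above A: R is still smaller than r, but r' is positive.
r_env_high = environmental_correlation(R=0.3, r=0.6, h=h, H=H)

print(round(A, 2), r_env_low, r_env_high)
```

In the second case the phenotypic correlation is still only half the genetic correlation, yet the implied environmental correlation is positive.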


TABLE 1
VALUES OF $A = \sqrt{hH}$ ¹

Heritability               Heritability of second trait
of one trait    .1    .2    .3    .4    .5    .6    .7    .8    .9   1.0
     .1        .10
     .2        .14   .20
     .3        .17   .24   .30
     .4        .20   .28   .35   .40
     .5        .22   .32   .39   .45   .50
     .6        .24   .35   .42   .49   .55   .60
     .7        .26   .37   .46   .53   .59   .65   .70
     .8        .28   .40   .49   .57   .63   .69   .75   .80
     .9        .30   .42   .52   .60   .67   .73   .79   .85   .90
    1.0        .32   .45   .55   .63   .71   .77   .84   .89   .95  1.00


¹The environmental correlation is negative when the ratio of the phenotypic correlation to the
genetic correlation is less than $A$.


This shows that for traits with low heritabilities (as is the case with
many traits of economic importance in farm animals) the ratio

phenotypic correlation/genetic correlation

has also to be low before a negative environmental correlation is
implied. For example with heritabilities 0.5 and 0.3 this ratio must be
less than 0.39, i.e. the genetic correlation must be more than
two-and-a-half times as great as the phenotypic correlation for the
environmental correlation to be negative.

A phenotypic correlation less than its genetic counterpart, together
with a small positive environmental correlation, will occur where the
genes governing two traits are similar but where the environments
pertaining to the expression of these traits have a low correlation.
For example, the genes controlling milk production in the first month of
a cow’s lactation and those controlling total lactation yield may be
quite highly correlated; but the environments pertaining to the first
month and to the lactation yields may have a low correlation. The
phenotypic correlation would then be less than the genetic correlation.
This is observed in estimates reported by Searle [1961] and VanVleck
[1960] and also by Lerner and Cruden [1948] for egg production. The
situation of an environmental correlation having a small value is by no
means universal and in many instances its value will be large (positive
or negative) because the environment generally affects an individual
in all its parts and functions.

The equation (3) for the phenotypic correlation $R$ can be written as

$$R = Ar + Br' \qquad (5)$$

where $A = \sqrt{hH}$ and $B = \sqrt{(1 - h)(1 - H)}$. Because $A < 1$ a
negative value of $r'$ implies $R$ being less than $r$. Thus a negative
environmental correlation always implies the phenotypic correlation being
less than the genetic correlation, although it can be less without the
environmental correlation necessarily being negative.

The equation for R when the heritabilities are equal is 


$$R = hr + (1 - h)r'$$


from which it is seen that if any two of R, r and r’ are equal the third 
is also. Thus equal heritabilities imply that equality of any two of 
the correlations is tantamount to equality of all three. 
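
The step behind this statement can be written out as a short worked derivation. It assumes $0 < h < 1$ (an assumption added here so that neither factor vanishes) and uses only the equal-heritability form $R = hr + (1 - h)r'$:

```latex
\begin{aligned}
R - r  &= hr + (1-h)r' - r  = (1-h)\,(r' - r), \\
R - r' &= hr + (1-h)r' - r' = h\,(r - r').
\end{aligned}
```

Both differences are multiples of $r' - r$, so for $0 < h < 1$ any one of $R = r$, $R = r'$ or $r = r'$ forces the other two.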


PHENOTYPIC AND GENETIC CORRELATIONS. 


The relationship of $R$ and $r'$ to $r$ can be discussed in terms of equation
(5), first noting that $A + B \leq 1$ because $A \leq \tfrac{1}{2}(h + H)$ and
$B \leq \tfrac{1}{2}(2 - h - H)$ arising from a property of geometric and arithmetic
means. The phenotypic correlation $R$ can only exceed the genetic, $r$,
when the environmental correlation $r'$ also exceeds $r$, and sufficiently
so; in fact $r'$ must be greater than $r(1 - A)/B$. Thus $R \gtrless r$ according
as $r'/r \gtrless (1 - A)/B$, and since $(1 - A)/B \geq 1$ there is a small region
where $R < r$ but $r' > r$; otherwise $R$ and $r'$ are less than $r$ together.
Values of $(1 - A)/B$ are shown in Table 2. An example is for heritabilities
of 0.3 and 0.5, for which the entry in the Table is 1.03; thus, if the
environmental correlation is less than 1.03 times the genetic correlation,
then the phenotypic correlation is less than the genetic correlation.
The values in this Table are close to 1.00 for heritabilities that are
small or alike, and for practical purposes in these cases both $r'$ and $R$
exceed or are less than $r$ together. But when the heritabilities are
unequal the fraction is larger, for example, its value is 1.50 when $h = 0.2$
and $H = 0.8$.


TABLE 2
VALUES OF $(1 - \sqrt{hH})/\sqrt{(1 - h)(1 - H)}$, OR $(1 - A)/B$ ²

Heritability               Heritability of second trait
of one trait    .1    .2    .3    .4    .5    .6    .7    .8    .9
     .1       1.00
     .2       1.01  1.00
     .3       1.06  1.01  1.00
     .4       1.10  1.04  1.00  1.00
     .5       1.16  1.08  1.03  1.00  1.00
     .6       1.27  1.14  1.09  1.04  1.00  1.00
     .7       1.42  1.29  1.17  1.12  1.05  1.00  1.00
     .8       1.71  1.50  1.38  1.23  1.16  1.11  1.00  1.00
     .9       2.33  2.07  1.85  1.67  1.50  1.35  1.24  1.07  1.00


²The genetic correlation exceeds the phenotypic correlation when the ratio of the environmental
correlation to the genetic correlation is less than $(1 - A)/B$.
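
The comparison of $R$ with $r$ through the ratio $(1 - A)/B$ can be sketched in a few lines. The function names and the trial values of $r$ and $r'$ below are hypothetical; the example assumes $r > 0$ and uses the heritabilities $h = 0.2$, $H = 0.8$ mentioned in the text.

```python
import math


def ratio_threshold(h, H):
    """(1 - A)/B with A = sqrt(hH) and B = sqrt((1-h)(1-H))."""
    A = math.sqrt(h * H)
    B = math.sqrt((1 - h) * (1 - H))
    return (1 - A) / B


def phenotypic(r, r_env, h, H):
    """Equation (5): R = A*r + B*r'."""
    return r * math.sqrt(h * H) + r_env * math.sqrt((1 - h) * (1 - H))


h, H = 0.2, 0.8
threshold = ratio_threshold(h, H)

r = 0.5
R_above = phenotypic(r, 0.8, h, H)    # r'/r = 1.6, above the threshold
R_between = phenotypic(r, 0.7, h, H)  # r'/r = 1.4, below it, yet r' > r

print(threshold, R_above, R_between)
```

The second case falls in the small region described in the text, where the environmental correlation exceeds the genetic correlation while the phenotypic correlation does not.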

The equation for $R$ represents a plane in the 3-dimensional system
having co-ordinate axes for $R$, $r$ and $r'$, but the relationships just
discussed can be illustrated by plotting the line of intersection of the planes
$R = Ar + Br'$ and $R = k$ on the plane $R = 0$, for various values of $k$.
This is shown in Figure 1 for the example $h = 0.2$ and $H = 0.8$ with
$R = 0.4r + 0.4r'$, and $R < r$ for $r' < 1.50r$. Lines for $R = 0.3$ and $R = 0.2$
are shown intersected by $r' = 1.50r$. Below this $R < r$ and above it
$R > r$. Between $r' = 1.50r$ and $r' = r$, $R$ is less than $r$ and $r'$ exceeds $r$;
i.e. the phenotypic correlation is less than the genetic but the
environmental exceeds it. Only the first quadrant of the $(r', r)$ Cartesian
system is shown, but extension of the lines for $R$ into the second and
fourth quadrants demonstrates the conditions under which the genetic
and phenotypic correlations are of different sign, and the environmental
correlation is negative.


FIGURE 1
PHENOTYPIC CORRELATIONS R = 0.2 AND 0.3 FOR HERITABILITIES 0.2 AND 0.8.
(Horizontal axis: genetic correlation $r$. Region labels: "Above this line $R > r$";
"Between these lines $R < r$ and $r' > r$".)


SUMMARY


(i) The phenotypic, genetic and environmental correlations, $R$, $r$
and $r'$, are connected by the relationship

$$R = r\sqrt{hH} + r'\sqrt{(1 - h)(1 - H)} ,$$

$h$ and $H$ being the heritabilities of the traits involved.

(ii) The environmental correlation is negative when $R$ and $r$ have the
same sign only if $R/r < \sqrt{hH}$; it is negative when $R$ and $r$ are of
opposite sign and $R$ is negative.

(iii) Equality of the heritabilities implies that when any two of the
correlations are equal there is equality of all three.

(iv) The phenotypic correlation exceeds (or is less than) the
genetic correlation according as the ratio of the environmental to
the genetic correlation exceeds (or is less than) the value of

$$(1 - \sqrt{hH})/\sqrt{(1 - h)(1 - H)}.$$
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RELATIVE EFFICIENCIES OF HERITABILITY 
ESTIMATES BASED ON REGRESSION 
OF OFFSPRING ON PARENT¹


B. B. BOHREN, H. E. McKEAN,
Population Genetics Institute, Purdue University, Lafayette, Indiana, U.S.A. 
AND 
Yukio YAMADA 
National Institute of Genetics, Misima, Japan 


INTRODUCTION 


The problem of optimal estimation of the coefficient of regression of 
offspring on parent (in the sense of minimum variance), when the 
number of progeny per parent is arbitrary, was completely solved by 
Kempthorne and Tandon [1953]. Prior to this paper, two methods 
were (and still are) commonly used: (1) the regression of the phenotypic 
mean of all offspring of a given parent on the parent’s record; (2) the 
regression of offspring on parent, in which the parent’s record is repeated 
for each of its progeny. Kempthorne and Tandon’s technique, which 
they refer to as (3) the weighted regression technique, assigns weights 
to the progeny means which are functions of the number of progeny 
and a guessed value of a correlation coefficient p between deviations 
from regression associated with two progeny of the same parent. The 
difficulty here lies in the fact that p is unknown. The success of the 
general technique depends upon guessing p accurately. Presumably 
if the guessed value of p is close to p, the weighted technique is close to 
optimal. The precise effect of a poor guess for p does not seem to be 
known. 

The purposes of this paper are (1) to investigate the nature and 
magnitude of the correlation coefficient p, and (2) to compare the 
efficiencies of the various techniques with respect to data from a popula- 
tion of poultry. 


THEORY 


General: Following the notation of Kempthorne and Tandon, the 
general model for the regression of offspring on one parent would be 


1Journal Paper Number 1659, Purdue University Agricultural Experiment Station. 
$$Y_{jk} = \mu_Y + \beta(X_j - \mu_X) + e_{jk} , \qquad (1)$$

where

$Y_{jk}$ = the phenotypic value of the $k$th progeny of parent $j$,

$X_j$ = the phenotypic value of parent $j$,

$\mu_Y$ = average phenotypic value of the offspring population,

$\mu_X$ = average phenotypic value of the parent population,

$e_{jk}$ = the deviation peculiar to the $k$th progeny of parent $j$,
and

$\beta$ = the regression coefficient of $Y$ on $X$.
From such a model Kempthorne and Tandon showed that
$$E(e_{jk}^2) = \sigma_P^2(1 - \beta^2),$$

$$E(e_{jk} e_{jk'}) = \sigma_P^2(\rho_1 - \beta^2),$$
and

$$\rho = (\rho_1 - \beta^2)/(1 - \beta^2),$$

where $\rho_1$ is the correlation between progeny having a common parent
($X_j$). This result is completely general when $\sigma_P^2 = \mathrm{Var}(Y_{jk}) =
\mathrm{Var}(X_j)$ and would apply to situations in which non-additivity existed
in the underlying genetic model and/or if correlations existed between
the environments within progeny groups. In such a case, $\rho_1$ could be
considered as the repeatability between members of a progeny group.

In order to simplify the considerations to follow, a base population
having certain characteristics is assumed.


(i) Only additive genetic variance is present. 

(ii) Effects of environment are random, so that the environmental 
correlation between individuals in a progeny group is essentially 
zero. 

(iii) Random mating is practiced. 

In the classical linear regression model, the $e_{jk}$ form a set of mutually
uncorrelated random variables. That such is not the case in this instance
can be clearly demonstrated, even in the completely additive situation.
If we knew the true breeding values (genotypic values in the additive
genetic model) of both the sires and the dams, and mating was random,
we would use the model

$$Y_{ijk} = \mu_Y + \tfrac{1}{2}(g_j - \mu) + \tfrac{1}{2}(g_i - \mu) + \delta_{ijk} + \epsilon_{ijk} , \qquad (2)$$

where

$g_j$ = the true breeding value of the parent of interest,
$g_i$ = the true breeding value of the other parent,
$\delta_{ijk}$ = the individual genetic deviation from the mean of the
sire and dam due to segregation,
$\epsilon_{ijk}$ = the random environmental deviation,
and
$\mu = E(g_j) = E(g_i)$.


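Under the assumption (i) of additive genetic variance, the total
genetic variance $\sigma_G^2$ is entirely additive; hence it is clear that

$$\mathrm{Var}(g_i) = \mathrm{Var}(g_j) = \sigma_G^2 ,$$
$$\mathrm{Var}(\delta_{ijk}) = \sigma_G^2/2 ,$$

$$\mathrm{Var}(\epsilon_{ijk}) = \sigma_E^2$$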


and 


This model (2) meets the conditions of the classical model in that
all random components are uncorrelated. In the application of this
model, however, only the phenotypic value of one parent is known,
while the progeny phenotypes center on the parent’s breeding value.
To illustrate, consider that with purely additive gene effects, $h^2$ may
be considered as the regression of the parent’s breeding value (in this
case genotypic value) on the parent’s phenotypic value, so that the
estimate of the breeding value of the parent based on its phenotype is

$$(\hat g_j - \mu) = h^2(X_j - \mu_X) \qquad (3)$$
but the true breeding value of the parent would be
$$(g_j - \mu) = h^2(X_j - \mu_X) + I_j \qquad (4)$$

where $g_j$ = the breeding value of the parent, $X_j$ = the phenotypic value
of the parent, and $I_j$ = the discrepancy between the estimated and
actual breeding values. It is important to note that the progeny of the
parent will vary around the point determined by (4) and not around
the point determined by (3). It is this fact which causes the
correlation between the errors of progeny from the same parent.

Therefore, in applying equation (2) we must replace $(g_j - \mu)$ by
(4), yielding, since $\beta = h^2/2$ under the assumptions,

$$Y_{ijk} = \mu_Y + \beta(X_j - \mu_X) + \tfrac{1}{2}I_j + \tfrac{1}{2}(g_i - \mu) + \delta_{ijk} + \epsilon_{ijk} . \qquad (5)$$

Comparing (1) and (5) we see that the term in (5) comparable to
$e_{jk}$ in (1) is

$$e_{ijk} = \tfrac{1}{2}I_j + \tfrac{1}{2}(g_i - \mu) + \delta_{ijk} + \epsilon_{ijk} . \qquad (6)$$
From (5) and (6) is obtained the model
$$Y_{ijk} = \mu_Y + \beta(X_j - \mu_X) + e_{ijk} , \qquad (5')$$

which is virtually identical to (1). The subscript $i$ in (5') is actually
superfluous, since the effect of the sire is embedded in $e_{ijk}$, so that
dropping the subscript $i$ yields model (1).

Since $E(e_{jk}) = 0$, it follows from either (6) or (4) that $E(I_j) = 0$.
Also, the variance of $I_j$ would be

$$E(I_j^2) = (1 - h^2)\sigma_G^2 .$$


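Special Cases: I. If we now make the further simplifying assumption
(iv) that no two progeny of the given parent (the dam) have the
same parent of the opposite sex (sire), we may rewrite (5) as

$$Y_{jk} = \mu_Y + \beta(X_j - \mu_X) + \tfrac{1}{2}I_j + \tfrac{1}{2}(g_{jk} - \mu) + \delta_{jk} + \epsilon_{jk} . \qquad (7)$$
Now from (6) or (7) we see that

$$E(e_{jk}^2) = \tfrac{1}{4}(1 - h^2)\sigma_G^2 + \tfrac{1}{4}\sigma_G^2 + \tfrac{1}{2}\sigma_G^2 + \sigma_E^2 = (1 - h^4/4)\sigma_P^2$$
and

$$E(e_{jk} e_{jk'}) = \tfrac{1}{4}E(I_j^2) = \tfrac{1}{4}(1 - h^2)\sigma_G^2 .$$
Then

$$\rho = \mathrm{Cov}(e_{jk}, e_{jk'})/\sqrt{\mathrm{Var}(e_{jk})\,\mathrm{Var}(e_{jk'})}
= \tfrac{1}{4}(1 - h^2)\sigma_G^2/(1 - h^4/4)\sigma_P^2 = h^2(1 - h^2)/(4 - h^4).$$

This result is exactly analogous to the result of Kempthorne and Tandon,
for under the assumptions (i)-(iv), their $\rho_1 = h^2/4$ and $\beta = h^2/2$.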

II. It often happens, especially in poultry breeding operations, that
matings are made up so that a sample of $s$ sires are each mated to a
distinct random sample of dams so that $d_i$ dams are mated to sire $i$.
In order for our assumptions to meet this situation it is only necessary
to replace assumption (iv) by a new assumption to the effect that (v)
all progeny of a given parent of interest (the dam) have the same parent
of the opposite sex (sire). We may now consider $Y$ as a function of the
breeding values of the sire and dam, assuming completely additive
gene effects, such that

$$Y_{ijk} = \mu_Y + \tfrac{1}{2}(g_i - \mu) + \tfrac{1}{2}(g_{ij} - \mu) + \delta_{ijk} + \epsilon_{ijk} , \qquad (8)$$

which is analogous to (2). Clearly the expected progeny mean for a
given sire depends upon the breeding value of the sire, or

$$E(Y \mid i) = \mu_Y + \tfrac{1}{2}(g_i - \mu) . \qquad (9)$$
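It is always the case, however, that only the dam’s phenotype
($X_{ij}$) is known, hence from (8) and (4) we may write

$$Y_{ijk} = \mu_Y + \tfrac{1}{2}(g_i - \mu) + \beta(X_{ij} - \mu_X) + \tfrac{1}{2}I_{ij} + \delta_{ijk} + \epsilon_{ijk} \qquad (10)$$
or
$$Y_{ijk} = E(Y \mid i) + \beta(X_{ij} - \mu_X) + e'_{ijk} \qquad (11)$$
where
$$e'_{ijk} = \tfrac{1}{2}I_{ij} + \delta_{ijk} + \epsilon_{ijk} . \qquad (12)$$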

This clearly shows that the genetic interpretation of the coefficient of
regression of offspring phenotypic values on the dam’s phenotypic value
is the same, regardless of whether the dams are mated to a random
group of sires or to a single sire. On the intra-sire basis, $\beta$ is estimated
within sires, one estimate for each sire, and a linear combination of
these estimates is taken as a single point estimate of $\beta$. From (11) we
see that for a given sire $i$,

$$E(e'_{ijk} \mid i) = 0,$$
$$E(e'^2_{ijk} \mid i) = \tfrac{1}{4}E(I_{ij}^2) + E(\delta_{ijk}^2) + E(\epsilon_{ijk}^2)
= \sigma_P^2(4 - h^2 - h^4)/4 .$$

This variance is less than the variance of $e_{jk}$ from model (7) because
$e'_{ijk}$ does not contain an effect due to sire. The covariance is the same
under both assumptions since from (12) we see that

$$E(e'_{ijk} e'_{ijk'} \mid i) = \tfrac{1}{4}E(I_{ij}^2) = \tfrac{1}{4}(1 - h^2)\sigma_G^2 .$$

The correlation $\rho^*$ between the errors of progeny of a given dam for a
fixed sire would then be

$$\rho^* = [\sigma_G^2(1 - h^2)/4]/[\sigma_P^2(4 - h^2 - h^4)/4] = h^2(1 - h^2)/(4 - h^2 - h^4) .$$

This result is again equivalent to that obtained by Kempthorne and
Tandon, since

$$\rho^* = (\rho_1^* - \beta^{*2})/(1 - \beta^{*2}),$$

where, under the assumptions including (v), $\rho_1^*$ is the correlation between
progeny of the same dam in the conditional population of progeny of
a given sire, and equals $h^2/(4 - h^2)$, while $\beta^*$ would have to be the
correlation between parent and offspring in the conditional population
or $h^2/\sqrt{4 - h^2}$, since $\sigma_Y^2$, being the conditional variance within sires,
is less than $\sigma_P^2$. For comparison, $\rho^*$ can also be evaluated in terms of
the parameters under the assumptions including (iv), in which case,

$$\rho^* = (\rho_1 - \beta^2)/(1 - \rho_1 - \beta^2).$$


Magnitude of $\rho$ and $\rho^*$: The quantity $\rho$ or $\rho^*$, depending on the
mating structure, is the intrinsic parameter upon which the Kempthorne
and Tandon weighted regression technique is based. Therefore,
it seems important to consider the possible values which $\rho$ or $\rho^*$ may
assume, in order to be able to choose the most efficient method of
estimating the regression of interest, or, if the weighted technique is
used, to enable an enlightened “guess” as to its value.

Examination of the values of $\rho$ and $\rho^*$ in terms of $h^2$ shows that
$\rho = \rho^* = 0$ if, and only if, $h^2 = 0$ or 1. Conversely $\rho$ or $\rho^* > 0$ for
$0 < h^2 < 1$. Elementary analytical techniques show that $\rho$ reaches a
maximum value when $h^2 = .536$, at which value $\rho = .067$. Similarly,
$\rho^*$ achieves an absolute maximum when $h^2 = .586$, when $\rho^* = .079$.
The functional relationship between $\rho$ and $h^2$ is illustrated in Figure 1.
It is recognized that these may be minimal values for $\rho$ and $\rho^*$, since
failure of assumption (ii) in particular will inflate these correlations.
The effect on $\rho$ of relaxing assumption (i) is not clear. The important
point is that these values would be unlikely to be large especially in a
poultry population, where environmental correlations would be small
or non-existent. In some larger species such as dairy cattle, such
environmental correlations could have sizable values.
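
The quoted maxima can be checked with a simple grid search. The sketch below assumes the two expressions $\rho = h^2(1 - h^2)/(4 - h^4)$ and $\rho^* = h^2(1 - h^2)/(4 - h^2 - h^4)$ obtained for special Cases I and II; the function names and the grid spacing of .001 are choices made here for illustration.

```python
def rho(h2):
    """Case I: error correlation when every progeny of a dam has a different sire."""
    return h2 * (1 - h2) / (4 - h2 ** 2)


def rho_star(h2):
    """Case II: error correlation within sires (hierarchal mating)."""
    return h2 * (1 - h2) / (4 - h2 - h2 ** 2)


# Heritabilities 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]

h2_at_max = max(grid, key=rho)
h2_at_max_star = max(grid, key=rho_star)

print(h2_at_max, round(rho(h2_at_max), 3))
print(h2_at_max_star, round(rho_star(h2_at_max_star), 3))
```

Both correlations stay below 0.08 over the whole range of heritability, which is the point made in the text about their small magnitude under assumptions (i) to (iii).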

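A primary advantage of the offspring on parent regression technique
is that unbiased estimates of $h^2$ may be obtained when the parents are
selected for the trait under consideration. It is of interest to determine
the effect of selection of the parents on the correlation between the
errors and indirectly on the variance of the estimated $\beta$’s. It has already
been observed that a general expression for $\rho$ under any mating structure
would be

$$\rho = (\rho_1 - r^2)/(1 - r^2) = [\rho_1 - \beta^2(\sigma_X^2/\sigma_Y^2)]/[1 - \beta^2(\sigma_X^2/\sigma_Y^2)],$$

where $\rho_1$ is the phenotypic correlation between progeny of the same
parent and $r$ is the correlation between parent and offspring, $\beta$ being a
constant. It is observed that for special Cases I and II,

$$\rho_1 = [\mathrm{Cov}(Y_{jk}, Y_{jk'})]/\mathrm{Var}(Y_{jk}) = [\beta^2\sigma_X^2 + \tfrac{1}{4}(1 - h^2)\sigma_G^2]/\sigma_Y^2
= \beta^2(\sigma_X^2/\sigma_Y^2) + \tfrac{1}{4}(1 - h^2)(\sigma_G^2/\sigma_Y^2).$$
Substituting this value for $\rho_1$ in the preceding equation yields
$$\rho = \tfrac{1}{4}(1 - h^2)(\sigma_G^2/\sigma_Y^2)/[1 - \beta^2(\sigma_X^2/\sigma_Y^2)] = \tfrac{1}{4}(1 - h^2)\sigma_G^2/(\sigma_Y^2 - \beta^2\sigma_X^2).$$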


It is clear that as the usual uni-directional selection occurs on the parents,
the values of both the numerator and denominator will decrease. To
evaluate this change it is now necessary to evaluate $\sigma_Y^2$ for a specific
mating structure. For illustration assume a hierarchal structure in
which

$$\sigma_Y^2 = \beta^2\sigma_X^2 + \tfrac{1}{4}(1 - h^2)\sigma_G^2 + \tfrac{1}{2}\sigma_G^2 + \sigma_E^2 .$$
Substitution of this value for $\sigma_Y^2$ in the preceding equation yields

$$\rho^* = \tfrac{1}{4}(1 - h^2)\sigma_G^2/[\tfrac{1}{4}(1 - h^2)\sigma_G^2 + \tfrac{1}{2}\sigma_G^2 + \sigma_E^2] .$$

It is seen that while $\sigma_X^2$ may be much smaller than $\sigma_P^2$, due to selection
of parents, the value $\rho^*$ is independent of the value of $\sigma_X^2$. Similarly,
in special Case I where each progeny of the dam has a different sire,
it can be shown that the value of $\rho$ is unaffected by selection of dams.

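Comparison of the three estimation procedures: All three procedures
for estimating $\beta$ are of course unbiased. The variances of the estimates
are the prime consideration. For any linear unbiased estimator of $\beta$,
in which

$$\hat\beta = \sum_j w_j (x_j - \bar x_w)\,\bar y_j \Big/ \sum_j w_j (x_j - \bar x_w)^2 ,$$

the variance will be

$$\sigma_{\hat\beta}^2 = \sigma^2(1 - \rho) \sum_j \frac{w_j^2}{n_j}(1 + n_j T)(x_j - \bar x_w)^2 \Big/ \Big[\sum_j w_j (x_j - \bar x_w)^2\Big]^2 , \qquad (14)$$

where

$\bar x_w = \sum_j w_j x_j / \sum_j w_j$, $T = \rho/(1 - \rho)$,
$\sigma^2 = \mathrm{Var}(e_{jk})$,

and
$w_j$ = a weighting factor applied to the information from the
$j$th dam.
If a hierarchal mating structure is used, $\rho$ would be replaced by $\rho^*$ and
$T$ by $T^* = \rho^*/(1 - \rho^*)$ in this discussion.
Kempthorne and Tandon showed that the minimum variance
resulted when

$$w_j = n_j/(1 + n_j T),$$
in which case the variance (14) reduces to

$$\sigma_{\hat\beta}^2 = \sigma^2(1 - \rho) \Big/ \sum_j w_j (x_j - \bar x_w)^2 \qquad (15)$$

and

$$\bar x_w = \sum_j \frac{n_j x_j}{1 + n_j T} \Big/ \sum_j \frac{n_j}{1 + n_j T} .$$

This is the optimum weighting applied to the Kempthorne-Tandon
weighted regression technique or method (3). However, the value of
$T$ is never known exactly so that it is necessary to guess a value for $T$,
which may be indicated by $\tau$. The weights then are

$$w_j = n_j/(1 + n_j \tau),$$
and the variance (14) reduces to

$$\sigma_{\hat\beta}^2 = \sigma^2(1 - \rho) \sum_j \frac{n_j (1 + n_j T)}{(1 + n_j \tau)^2}(x_j - \bar x_\tau)^2 \Big/ \Big[\sum_j \frac{n_j}{1 + n_j \tau}(x_j - \bar x_\tau)^2\Big]^2 , \qquad (16)$$

where

$$\bar x_\tau = \sum_j \frac{n_j x_j}{1 + n_j \tau} \Big/ \sum_j \frac{n_j}{1 + n_j \tau} .$$

This value is larger than (15) since the weighting is not optimum,
but will approach the minimum variance (15) as $\tau$ approaches $T$.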
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Method (2), or the technique in which the dam’s record is repeated 
for each progeny record, is in fact a special case of the Kempthorne- 
_Ta don technique in which 7 is set equal to zero. In this case w; = n; 


and when these weights are used in (14) or 7 considered to be zero in 
(16) the variance becomes 


of, = = p) + n,T)(z; n(x; — %)’)’, (17) 


in which 7 = nj;t;/ 
In method (1), in which the progeny means are regressed on tlic 


dam’s record, w; is simply set equal to one. From (14) the variance of 
this estimate is 


and = )., z,/d, where d is the number of dams. 
It may be noted that if n, = n. = --- n, = n, then (14) reduces to 


oh = ofl + — — (19) 


and is the same for all three estimation techniques. 

The minimum variance occurs when $\tau = T$. Consequently if
$T = 0$, then the repeated dam technique (method 2) will yield the
minimum variance. If $T \neq 0$, then the weighted technique (method 3)
will approach the minimum variance provided the estimate $\tau$ is close
to $T$. Since the relationship between $h^2$ and $\rho$ is clear (granted the
genetic assumptions), an intelligent guess for $\tau$ can usually be made by
applying prior knowledge.
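
The comparison of the three weightings can be sketched numerically. The block below is a minimal illustration, not the authors' computation: it assumes that the general variance has the usual weighted-regression form written in the docstring (an assumption chosen to be consistent with the special cases for methods 1 to 3), and the function names and the data values are hypothetical.

```python
def slope_variance(n, x, w, T, sigma2=1.0):
    """Variance of a weighted regression of progeny means on dam records.

    Assumed general form (illustrative, not quoted from the paper):
        sigma2 * (1 - rho) * sum(w^2 * (1 + n*T)/n * (x - xbar)^2)
                             / (sum(w * (x - xbar)^2))^2
    with rho = T / (1 + T) and xbar the w-weighted mean of x.
    """
    rho = T / (1.0 + T)
    xbar = sum(wj * xj for wj, xj in zip(w, x)) / sum(w)
    num = sum(wj ** 2 * (1.0 + nj * T) / nj * (xj - xbar) ** 2
              for nj, xj, wj in zip(n, x, w))
    den = sum(wj * (xj - xbar) ** 2 for xj, wj in zip(x, w)) ** 2
    return sigma2 * (1.0 - rho) * num / den


def weights(n, tau):
    """Weights n_j / (1 + n_j * tau); tau = 0 gives the repeated-dam weights."""
    return [nj / (1.0 + nj * tau) for nj in n]


# Made-up illustration data: progeny numbers and dam records.
n = [2, 5, 9, 12, 7]
x = [48.0, 55.0, 61.0, 52.0, 66.0]
T = 0.06

v1 = slope_variance(n, x, [1.0] * len(n), T)   # method 1: w_j = 1
v2 = slope_variance(n, x, weights(n, 0.0), T)  # method 2: w_j = n_j
v3 = slope_variance(n, x, weights(n, T), T)    # method 3 with tau = T
print(v1, v2, v3)

# With equal family sizes the three weightings give the same variance.
n_equal = [6] * 5
equal = [slope_variance(n_equal, x, w, T)
         for w in ([1.0] * 5, weights(n_equal, 0.0), weights(n_equal, T))]
print(equal)
```

Under this assumed form, method 3 with $\tau = T$ can never give a larger variance than the other two, and the three agree when all $n_j$ are equal.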

The variances of the three estimates of $\beta$ are also considerably
affected by the distribution of the $n_j$ values. A detailed discussion of
this point will be presented in a subsequent paper.


AN ILLUSTRATION 


Kempthorne and Tandon found little difference between the variances
of the three estimates in their illustration involving dairy cattle
data. As they point out, this result is not surprising since the average
n was only 1.39 and the estimate of $\rho$ was actually negative. Poultry
data involving larger numbers of progeny should yield more reliable
estimates of the parameters and more precise estimates of the variances
of the three estimated $\beta$’s. Data are available consisting of five generations
(progeny populations for 1952-56 inclusive) of White Leghorns
previously described in detail by Yamada, Bohren and Crittenden
[1957]. The trait considered is percent production to January 1, transformed
to angles. The average n per dam over the five years is 7.1,
TABLE 1


Estimates of Pertinent Population Parameters and Number of Parents
and Progeny Observed in Each of Five Years.


Year Bi No. sires No. dams Average n
1952 .064 .128 .059 .0557 10 106 5.7
1953 .026 .052 .0282 10 92 9.5
1954 .0572 .096 .051 .0484 11 78 7.3
1955 .0688 .055 18 108 7.9
1956 .0404 .089 .034 .0327 20 132 5.4
Ave. .0513 .100 .616 .0434 7.1


varying from 5.3 in 1956 to 9.5 in 1953 (Table 1). Since the sires were
mated for the season to a random group of dams the regressions are to be
estimated on an intrasire basis. Consequently interest will center on
estimating $\beta^*$, $\rho^*$ and $T^*$. Analysis of variance in the hierarchal model
was used to derive estimates of the variance components for sires
($\sigma_S^2$), dams ($\sigma_D^2$) and progeny within dams [$\sigma_W^2 = \sigma^2(1 - \rho)$]. From these
components were derived estimates of $\rho$ (the full-sib correlation in this
example) and preliminary estimates of $h^2$, for the purpose of estimating
$T^*$, which in turn, is needed in selecting the appropriate weighting
factors. These estimates were obtained from the variance components as
At = + 60), 
and
hy = + + 69). 


Each annual value of $\hat{h}^2$ so derived was considered as a preliminary
estimate of $2\beta$, so $\hat\beta_1 = \hat{h}^2/2$. The estimates $\hat\beta_1$ and $\hat\rho$ were then used
to obtain the estimates $\hat{T}^*$ and $\hat\rho^*$. The results are shown in Table 1.
All values are relatively consistent from year to year and none of the
values for $\hat\rho^*$ are outside the expected range based on the theoretical
additive genetic model.

Since the value of $T^*$ is unknown, the estimated value $\hat{T}^*$ derived
from the data for each year was used in the formulae pertaining to
the variances of the estimates. To approximate the minimum variance
under the weighting technique (method 3) $\tau$ was set equal to $\hat{T}^*$. For
the repeated parent technique (method 2), $\tau$ was assumed to be zero.

Estimates of $\beta$ were obtained within each sire group in each year
by each of the three estimation techniques. The variances of regressions
in each sire group were estimated by use of the appropriate formula
(i.e. 15, 17, or 18). The individual sire estimates of $\beta$ were pooled over
sires in years, using the reciprocals of the estimated variances as weights
for the corresponding regression coefficients, to obtain a point estimate
of $\beta$ for each year. The estimated variance of the single point estimate
of $\beta$ for the year is the harmonic mean of the variances for each sire
in the year divided by the number of sires in the year. Thus

$$\hat\beta = \frac{\sum_i \hat\beta_i / \hat\sigma_{\hat\beta_i}^2}{\sum_i 1 / \hat\sigma_{\hat\beta_i}^2}$$

and

$$\hat\sigma_{\hat\beta}^2 = \left[ \sum_i (1/\hat\sigma_{\hat\beta_i}^2) \right]^{-1},$$

where $i$ runs over the sires in the year.
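
The pooling over sires described above can be sketched as follows. This is a minimal illustration of inverse-variance weighting as the text describes it; the function name and the input values are hypothetical and are not taken from the poultry data.

```python
import math


def pool_over_sires(betas, variances):
    """Pool within-sire regression estimates with reciprocal-variance weights.

    Returns the pooled estimate and its estimated variance, the latter being
    the harmonic mean of the sire variances divided by the number of sires.
    """
    recip = [1.0 / v for v in variances]
    total = sum(recip)
    beta = sum(r * b for r, b in zip(recip, betas)) / total
    return beta, 1.0 / total


# Made-up within-sire estimates and their estimated variances.
betas = [0.10, 0.20]
variances = [0.01, 0.04]

beta_hat, var_hat = pool_over_sires(betas, variances)

# Heritability is estimated by doubling beta; its standard error doubles too.
h2_hat = 2.0 * beta_hat
se_h2 = 2.0 * math.sqrt(var_hat)
print(beta_hat, var_hat, h2_hat, se_h2)
```

The second return value equals the harmonic mean of the sire variances divided by the number of sires, which is the form stated in the text.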


To estimate heritability ($h^2$), the estimates of $\beta$ were doubled. The
estimates of $h^2$ and the estimated standard errors for each of the three
methods of derivation are presented for each of the five years in Table 2.
It is clear that in each year the estimated standard errors of the regression
estimates based on progeny means (method 1) are the largest.


TABLE 2


Heritability Estimates and Standard Errors, Based on Regression
Coefficients Estimated by Three Different Methods.


                          Method
Year         1               2               3
1952     .156 ± .242     .317 ± .204     .281 ± .202
1953    -.006 ± .171    -.073 ± .166    -.058 ± .166
1954     .333 ± .304     .294 ± .297     .304 ± .296
1955     .110 ± .172     .098 ± .169     .102 ± .169
1956     .309 ± .216     .340 ± .198     .342 ± .198


There is little difference in the efficiencies of methods (2) and (3) in 
these data. Unless the data to be considered have larger values of $\rho$
or $\rho^*$ than observed in the present set of data, there appears to be little
advantage in using the weighted technique (method 3) in preference 
to the repeated parent technique (method 2). 


REFERENCES 
QUERIES AND NOTES
D. J. Finney, Editor 


164 NOTE: On a Formula for the Estimation of the Optimum 
Dressing of a Fertilizer 


F. Pimentel Gomes,
University of São Paulo, São Paulo, Brasil.


If Mitscherlich’s law
$$y = A[1 - 10^{-c(x+b)}]$$


adequately represents the yield y of a crop to which x units of a nutrient
have been applied, the optimum dressing $x^*$ (Pimentel Gomes [1953])
is given by the expression


$$x^* = (1/c) \log \frac{cwA}{t \log e} - b. \qquad (1)$$


Let $x_u$ be a standard dressing of a nutrient and u the response to it;
formula (1) can be written


$$x^* = (1/c) \log \frac{c x_u}{(1 - 10^{-c x_u}) \log e} + (1/c) \log \frac{wu}{t x_u},$$

where, as before, w is the unit price of the crop yield, and t the unit price
of the nutrient. This formula can be simplified through the expansion
of its first term, as we shall show.

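Let


$$z = c x_u / \log e,$$


$$Y = (1/c) \log \frac{c x_u}{(1 - 10^{-c x_u}) \log e} = (1/c) \log \frac{z}{1 - e^{-z}}.$$


We have


$$\frac{dY}{dz} = \frac{\log e}{c} \left[ \frac{1}{z} - \frac{1}{e^z - 1} \right].$$


But it is known (Cramér [1946] p. 123) that


$$\frac{z}{e^z - 1} = 1 - \frac{z}{2} + \frac{B_1}{2!} z^2 - \frac{B_2}{4!} z^4 + \cdots,$$

where $B_1, B_2, \cdots$ are Bernoulli numbers. This power series converges
for $|z| < 2\pi$, and therefore can be integrated with this restriction.
We obtain:


$$Y = \frac{\log e}{c} \left[ \frac{z}{2} - \frac{B_1}{2 \cdot 2!} z^2 + \frac{B_2}{4 \cdot 4!} z^4 - \cdots \right],$$

whence,


$$Y = x_u \left[ (1/2) - (1/24)(c x_u/\log e) + (1/2880)(c x_u/\log e)^3 - \cdots \right].$$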


The last series is alternating. If we keep only the first term, the error
committed is less than $(x_u/24 \log e)\, c x_u$, that is, less than $c x_u^2/10$.
If two terms are kept, the error of the approximation is less than


$$(x_u/2880)(c x_u/\log e)^3.$$


Evidently, in applications, the first term gives a rather good approxi- 
mation suitable for most cases, and this first approximation is inde- 
pendent of the parameters of the curve. To show how good this approxi- 
mation is we give in Table 1 exact and approximate values of Y for 
the standard dressing ($x_u$) of 60 kg/hectare, which is suitable for most
cases in practice. 


TABLE 1
Values of c        Values of Y (kg/hectare)
(hectares/kg)      Exact      Approximate
0.0020             29.4       30.0
0.0050             28.3       30.0
0.0090             26.9       30.0
0.0100             26.6       30.0


In most cases the approximate value
$$x^* = (1/2) x_u + (1/c) \log (wu/t x_u), \qquad (2)$$


exceeds the true value, formula (1), by less than 5 percent.

From (2) we conclude also that, when the additional income (wu)
produced by the nutrient applied is equal to the additional cost of
fertilization ($t x_u$), the optimum dressing is still approximately $(1/2) x_u$.
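
The quality of the approximation can be checked numerically. The sketch below is illustrative only: it assumes the closed form $Y = (1/c) \log [z/(1 - e^{-z})]$ with $z = c x_u/\log e$ and common logarithms, and the function names are hypothetical. It prints the exact value, the one-term value $x_u/2$ and the three-term series for the values of c used in Table 1, for comparison with that table.

```python
import math

LOG10_E = math.log10(math.e)  # "log e" in the text, with common logarithms


def exact_Y(c, x_u):
    """Exact first term Y = (1/c) log10[z / (1 - exp(-z))], z = c*x_u/log10(e)."""
    z = c * x_u / LOG10_E
    return math.log10(z / (1.0 - math.exp(-z))) / c


def series_Y(c, x_u, terms=1):
    """Leading terms of x_u * [1/2 - z/24 + z^3/2880 - ...]."""
    z = c * x_u / LOG10_E
    coefficients = [0.5, -z / 24.0, z ** 3 / 2880.0]
    return x_u * sum(coefficients[:terms])


x_u = 60.0  # standard dressing, kg/hectare
for c in (0.0020, 0.0050, 0.0090, 0.0100):
    print(f"{c:.4f}  exact={exact_Y(c, x_u):6.2f}  "
          f"one term={series_Y(c, x_u, 1):6.2f}  "
          f"three terms={series_Y(c, x_u, 3):6.2f}")
```

For each of these values of c the one-term error stays below the bound $c x_u^2/10$ quoted above.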

Let us suppose, now, that x* will be estimated with the aid of a 
value of c obtained from a large group of previous experiments. Then 
we may take c as constant and obtain from either formula (1) or (2)




494 BIOMETRICS, SEPTEMBER 1961


$dx^* = (\log e/c)(du/u)$;
hence an estimate of the variance of $x^*$ is

$$\hat{V}(x^*) = (\log e/cu)^2 (2s^2/r),$$
where the response u is supposed to have been estimated by the differ- 
ence between two means of r replications each. It is interesting to 
note that this estimate of $V(x^*)$ is independent of the unit prices w and t.
On the other hand, if u and c are estimated with data of the same 
experiment, then it is easier to use formula (2), which gives 


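$$dx^* = (\log e/\hat{c})(du/u) - (1/\hat{c}^2) \log (wu/t x_u)\, d\hat{c};$$


hence


$$V(x^*) \approx (\log e/u\hat{c})^2 V(u) + (1/\hat{c}^4)[\log (wu/t x_u)]^2 V(\hat{c}) - 2(\log e/u\hat{c}^3) \log (wu/t x_u)\, \mathrm{Cov}(u, \hat{c}).$$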
BOOK REVIEWS


J. G. SKELLAM, Editor 
Members are Invited to Suggest Books for Listing of Review to the Editor 


12 EKAMBARAM, S. K. The Statistical Basis of Quality Control Charts. A
Manual for Business and Factory Managers. Bombay and London: Asia
Publishing House, 1960. Pp. x + 96. 16 Tables and 14 Diagrams. Rs. 6.50,
16s.6d.


L. R. Shenton, University of Manchester, Manchester, England.


This short introductory account of control theory and practice is intended for 
managerial and technical personnel, and also for initial courses at university level. 
After an account of frequency distributions leading to the normal law, the control 
chart for quality is described, and then the control of fraction defective, concluding 
with remarks on acceptance sampling. In describing the underlying statistical 
ideas Professor Ekambaram is unusually lucid and achieves considerable success, 
although at times the terminology is unconventional. 

It seems unnecessary to have evaluated separately and in detail the first four 
moments of the Normal, Binomial and Poisson distributions; in any case little use 
is made of $\mu_3$ and $\mu_4$. On the other hand there is scant reference to the normal
probability integral, and few values are quoted. The use of $_nC_r$ for the usual binomial
coefficient $\binom{n}{r}$ may be confused with n times $C_r$, and the printing of $\sum x/N$ as
$1/N \sum x$ seems unfortunate. The formula for a normal probability density is
incorrectly printed on p. 21. The printing is sometimes tenuous and certainly
lacks uniformity.


13 LIEBERMAN, GERALD J. AND DONALD B. OWEN. Tables of the
Hypergeometric Probability Distribution. Stanford: Stanford University
Press, 1961. Pp. vi + 726, Part I: 6 Tables; Part II: Tables; Appendix. $15.00.


R. A. Bradley, The Florida State University, Tallahassee, Florida, U.S.A.


This book of tables gives a comprehensive tabulation of the hypergeometric 
probability distribution together with discussions of typical applications. In the 
notation of the book 
$$p(x) = p(N, n, k, x) = \frac{k!\, n!\, (N - k)!\, (N - n)!}{x!\, (k - x)!\, (n - x)!\, (N - k - n + x)!\, N!}$$
where max [0, n + k − N] ≤ x ≤ min [n, k]. P(x) is used as the cumulative distribution.
Symmetries are noted that permit reduction of the volume of the tables.
Part II of the book and the Appendix consist of tables. The main section of tables,
pp. 33-627, is a tabulation of p(x) and P(x) for N = 2, n = 1 through N = 100,
n = 50. The second section of tables, pp. 628-705, shows values of the same quantities
for N = 1000, n = 500. The third section of tables, pp. 706-713, gives p(x)
and P(x) for N = 100, n = 50 through N = 2000, n = 1000 with k = n − 1, n;
n = N/2. The Appendix consists of values of log N! for N = 1, ···, 2000. The
computations were effected on an IBM 704 computer and sufficient checks were used
to insure the six-place accuracy given for all values of p(x) and P(x).
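
For readers who want to check individual entries of such a table, the probability in the notation above can be sketched directly from binomial coefficients. The block is an illustration only, not the tabulation method used for the book; the function names are hypothetical.

```python
from math import comb, isclose


def hyper_pmf(N, n, k, x):
    """p(N, n, k, x) = C(k, x) * C(N - k, n - x) / C(N, n) on its support."""
    lo, hi = max(0, n + k - N), min(n, k)
    if not lo <= x <= hi:
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def hyper_cdf(N, n, k, x):
    """Cumulative probability P(x): sum of p(i) for i up to and including x."""
    lo = max(0, n + k - N)
    return sum(hyper_pmf(N, n, k, i) for i in range(lo, x + 1))


p = hyper_pmf(10, 5, 4, 2)
print(round(p, 6), round(hyper_cdf(10, 5, 4, 2), 6))

# One of the symmetries that reduce the volume of tables: n and k interchange.
print(isclose(p, hyper_pmf(10, 4, 5, 2)))
```

The interchange of n and k in the last line is one example of the kind of symmetry that lets a table be shortened.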

Part I of the book deals with comments and definitions relative to production 
of the tables, applications of the tables, and approximations to the hypergeometric 
probability distribution. The discussion of applications is clear, concise and useful. 
The examples given are Applications to a Sequential Procedure, Applications to 
Tests of the Equality of Two Proportions, Applications to the Distribution of the 
Number of Exceedances, Applications to the Binomial Distribution (Bayesian 
Prediction), and Applications to Sampling Inspection. A bibliography containing 
sixty-six references will be useful to users of these tables. 

Tables of the Hypergeometric Probability Distribution will become an important 
reference volume and one recommended to all working in areas of relevance. 


14 PILLAI, K. C. S. Statistical Tables for Tests of Multivariate Hypotheses.
Manila: Statistical Center, University of the Philippines. 1960. Pp. viii + 46.


M. J. R. Healy, Rothamsted Experimental Station, Harpenden, England.


Most of the significance tests in common use can be regarded as special cases 
of the F-test for the equality of two estimated variances. The corresponding 
multivariate test would be one for the equality of two variance-covariance matrices, 
and the natural requirement that the result of the test should not depend on the 
scales of measurement implies that the test criterion must be some function of the 
latent roots (eigenvalues) of the two matrices. Denoting the matrices by $S_1$ and $S_2$,
all tests so far proposed have in fact been based on the latent roots of the “ratio”
$S_1 S_2^{-1}$ or of $S_1(S_1 + S_2)^{-1}$. Wilks’ $\Lambda$ criterion is equal to the product of the roots,
i.e. the determinant of the matrix, while other authors have suggested using the 
sum of the roots or the largest or smallest root. 

Pillai’s tables are for use with the last two criteria. Table 1 gives the upper 
5% and 1% points of the largest latent root for s, the number of measurements, 
equal to 2(1)6, while Tables 2 and 3 give the same percentage points for the sum
of the roots of $S_1(S_1 + S_2)^{-1}$ and of $S_1 S_2^{-1}$ respectively for s = 2(1)8. The last
criterion is equivalent to Hotelling’s $T_0^2$. Significance levels of the largest latent
root have been tabulated very fully by F. G. Foster and D. H. Rees for s = 2, 3 and 
4. Otherwise, the tables appear to break new ground. A short introduction includes 
several examples of multivariate problems to which the suggested tests can be applied. 

The user of the tables is not given all the help he might expect. The parameters 
are not as convenient as the degrees of freedom in the Foster-Rees tables, and the
parameter values make interpolation awkward. No guidance on interpolation is 
given in the introduction, and there is no indication of the accuracy of the tabular 
entries although these are based on approximations to the true distribution functions. 
In tables 2 and 3, the 5% and 1% values for a given s do not appear at a single 
opening of the book. 

In spite of these shortcomings, the tables ought to be widely used, if only to 
provide experience on which to base further work. Several questions cannot at 
present be answered. In particular, no guidance can be given in choosing between 
the three alternative tests. Another open question concerns the robustness of the 
tests to non-normality; analogy with the ordinary F-test suggests that they may be 
unduly sensitive in this respect. 
15 YATES, F. Sampling Methods for Censuses and Surveys. 3rd Edition.
London: Charles Griffin and Co. Ltd. 1960. Pp. xvi + 440. 54s.


M. D. Mountford, The Nature Conservancy, London, England.


This book is still being read after ten years since its first edition and has thus, 
according to the modern definition, achieved the status of a classic. 

This, the third edition, is a reprint of the second edition with an extra chapter 
on the use of electronic computers in the analysis of censuses and surveys. The 
value of high-speed computers in large-scale survey work is now unquestioned; 
Dr. Yates’ exposition of their workings and the timeless character of the earlier 
chapters brings this book right up to date. 

The new chapter begins with a concise description of the main features of a
computer and of the principles of constructing a programme of machine orders. He 
then presents a general programme for the analysis of survey results. This same 
general programme, with slight modifications, will also serve to instruct the computer 
to analyse the results of many different types of sampling schemes, including the 
simple random, stratified, ratio, regression and multiphase methods. The exposition
is clear and unvarnished, though the reader may be dazzled by the simplicity of the 
unified treatment of the different sampling methods. 

As a standard manual for the practical survey worker this book, to my knowledge, 
still has no equal. 
ABSTRACTS


The following are abstracts of papers presented at meetings of the 
British Region on February 28 and April 18, 1961. 


753 P. D. OLDHAM (M. R. C. Pneumoconiosis Research Unit, Llandough Hospital,
Penarth, Glamorgan). The Distribution of Arterial Pressure in the
General Population.


Blood pressure measurements of the general population form smooth distribu- 
tion curves whose means and standard deviations vary with age and differ between 
the sexes. No other common factors influencing arterial pressure to a major extent 
have been discovered, nor does it appear, from second surveys of the same samples, 
that the distribution of change of pressure will materially depend on simple, common 
factors. The interpretation of these distributions raises the problem, occurring 
in all fields of medicine, of distinguishing normality from abnormality. The tendency 
is for unsatisfactory and arbitrary rules to be adopted for this purpose, rules which 
ignore the evident fact that abnormality cannot be diagnosed from the result of a 
single test, since the innumerable factors influencing function must be mutually 
correlated. 


754 L. R. TAYLOR (Rothamsted Experimental Station, Harpenden, Herts.).
A Power Law Transformation for Aggregated Populations.


The individuals of any species affect each other in many ways. The total effect 
of this attraction or repulsion appears, in the spatial distribution of the population,
as a departure from the statistically simple ideal of randomness. The variance is
affected and some powerful analytical techniques are inapplicable. In set experiments,
where the mean varies only 100 or 200% between treatments this can be overcome 
by a transformation devised for the occasion and not necessarily very effective at 
other population densities. To be effective in field work, where means may cover 
6 or more log cycles, a transformation must have a sound basis. 

Such a system of transformations has been found, empirically, which fits all 
data so far available. It derives from the hypothesis that variance is proportional 
to a fractional power of the mean ($s^2 = am^b$). Considerable evidence supports
this; only 1 out of over 30 populations examined shows appreciable deviation. The 
index b appears to be much more specifically stable than the factor a which varies 
with sampling method, population trend etc., (which may be very local e.g. increase 
by reproduction). It is suggested that b is a specific Index of Aggregative Behaviour, 
present in all individuals, possibly influenced by environment but independent of 

population density and trends. 

The transformation function is $\phi(m) = Q \int m^{-b/2}\, dm$ where $\phi(x)$ is the
transformation for individual counts, Q is a constant and m and b are derived from the
power law for variance. The factor a disappears in transformation, which therefore remains
the same for the same species with different sampling practices in the material so
far examined. (For further details see Nature (1961), 189, 732-5.)
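
The transformation can be written out explicitly by carrying out the integral. The sketch below is illustrative and the function names are hypothetical: it takes the constant of integration as zero, uses the logarithm for the special case b = 2, and uses a first-order (delta-method) approximation to show why the transformed variance no longer depends on the mean.

```python
import math


def taylor_transform(x, b, Q=1.0):
    """Q * integral of x^(-b/2) dx: a power of x, or log x when b = 2."""
    if math.isclose(b, 2.0):
        return Q * math.log(x)
    p = 1.0 - b / 2.0
    return Q * x ** p / p


def approx_transformed_variance(m, a, b, Q=1.0):
    """First-order variance of the transformed count at mean m.

    Uses Var[phi(X)] ~ phi'(m)^2 * Var(X) with Var(X) = a * m^b.
    """
    derivative = Q * m ** (-b / 2.0)
    return derivative ** 2 * a * m ** b


# b = 1 (variance proportional to the mean) gives a square-root transform.
print(taylor_transform(9.0, 1.0))

# The approximate variance is the same at very different means.
for m in (0.5, 10.0, 1000.0):
    print(m, approx_transformed_variance(m, a=2.5, b=1.4))
```

The factor a enters only the constant level of the transformed variance and not the form of the transformation, which is the sense in which it disappears.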


755 G. HARRINGTON (A. R. C. Statistics Group, School of Agriculture, Cam- 
bridge). Studies of Visual Judgments of Quality in Bacon. 


The series of experiments to be described studied the manner in which ex- 
perienced and naive judges handled their rating scale when visually assessing rela- 
tively simple characteristics of bacon. The sorts of attributes involved were the 
“proportion of lean to fat’’ on a cut surface revealed when a bacon side is cut into 
two halves, and the component features of this. Most experiments were carried 
out using photographs, in some cases mailed to the judges, although a few involved 
actual bacon sides. Balanced incomplete block arrangements were used so that the 
average quality of the various batches judged could be varied systematically. The 
analysis was concerned with the relative importance of variations in scores intro- 
duced by alterations of “judging standard” (position on rating scale) from batch 
to batch, judge differences etc., and the interrelations between various scores and 
measurements which may have influenced them.


756 J. M. TANNER and M. J. R. HEALY (Institute of Child Health, Gt. Ormond
St., London, W. C. 1. and Rothamsted Experimental Station, Harpenden,
Herts.). Assessment of Maturity from X-rays of the Wrist and Hand.


The bones of the wrist and hand pass through a number of distinguishable 
stages of growth during the period between birth and adulthood, and their overall 
state of development may be taken as a measure of the individual’s level of physical 
development. This is often done by way of an assessment of “skeletal age’. A 
scoring system for deriving skeletal age from an X-ray will be described. 


The following are abstracts of papers presented at the meetings of E.N.A.R. held 
at Cornell University, Ithaca, N. Y., on April 20, 21, 22, 1961. 


757 W. H. BEYER and R. E. BARGMANN (Virginia Polytechnic Institute, 
Blacksburg, Virginia). Symmetrical Complementation Design. 


This design is intended for those experimental situations where the total amount 
of three treatments is a constant. The levels of the treatments are referred to a 
common unit of measurement, and are equally spaced. Certain cell entries are 
omitted to insure complete exchangeability of the three treatments. The usual 
additive model is assumed. This design differs from the usual types of design, in
that the number of estimable contrasts is limited. Estimable functions in one 
treatment only and in two treatments are presented for the general case of p levels. 
Several methods are employed in order to obtain estimates of the treatment effects 
under various constraints. These estimates are rather meaningless quantities, as 
it is only when they are combined in estimable functions that unique results are 
obtained. Sums of squares and test statistics are presented for the various 
estimable hypotheses formulated. This paper shows that the “hypothesis of sub- 
stitution” is one of the most important to consider. If accepted this says that 
applying one treatment at a low level and another at a high level does not produce 
results which are different when the two treatments are interchanged. Indication 
of how one might consider response functions for single treatments is given. The 
analysis is also extended to an analysis of covariance and then further to a multi-
variate analysis. Recommendations for interpretation and statement of limitations 
are made in detail. 


758 A. E. BRANDT (Statistical Section, Agricultural Experiment Stations,
University of Florida, Gainesville, Fla.). The Analysis and Interpretation of
Half-Replicate Experiments.


An IBM 650 program for analyzing the data from a factorial experiment in- 
volving not more than 8 factors or independent variables or from an experiment 
which can be arranged in factorial form is presented. Of the 8 factors, 6 must have
less than 10 classes and the remaining 2 may have 10 or more. 

The 650 may be used to design a half-replicate experiment, that is, to designate 
the treatment combinations to be used, and to analyze the results. The output 
consists, in the case of a $(2)^n$ half-replicate experiment, of $(2)^n - 1$ cards. Of these,
$2^n - 2$ contain the information concerning variances. These cards occur in pairs,
one member of each being called an alias, on the basis of variances.

The data from a $(2)^2(4)^2$ experiment presented by W. H. Horton, Westinghouse
Electric Corporation, to a seminar July 14, 1960 were analyzed by this program
and the results submitted. A question is raised as to the interpretation of results. 
The effect of temperature level proved to be highly significant but one wonders 
if its alias, in this case a high order interaction involving the other 2-level factor and 
2 levels of each of the 4-level factors, is to be ignored. 

The results of a $(2)^5(4)^2$ field experiment over 4 years were analyzed and the
results presented with both identifications given for each separate sum of squares. 
The question is again raised as to which member of a pair is to be accepted. 


759 BYRON WM. BROWN, JR. (School of Public Health, University of Minnesota,
Minneapolis, Minnesota). Some Characteristics of the Spearman-Karber
Estimator in Bioassay.


Let the dose-response function in quantal assay be a distribution function with
mean $\mu$. An experiment for estimation of $\mu$ involves n subjects tested at each of
the dose levels $x_i = x_0 + id$, $i = 0, \pm 1, \pm 2, \cdots$, where n, d and $x_0$ are fixed. The
Spearman estimator is defined for this infinite experiment and shown to have finite
mean and variance. Maxima are obtained for the bias, taken over all placements,
$x_0$, of the dose mesh, for (i) all distribution functions and (ii) all unimodal distribution
functions with specified maximum slope. These maxima are compared
with the sequence of bounds obtained using Euler-MacLaurin formulae. The
mean square error of the Spearman estimator is given for $x_0$ randomly chosen over
(0, d). The minimum mean square error of the estimator, for random choice of
$x_0$ and fixed $n' = n/d$, occurs when n = 1 and $d = 1/n'$. As $n'$ becomes infinite
the estimator is consistent. The asymptotic variance of the estimator is defined
and used to define asymptotic efficiency relative to the information. The only
symmetric dose-response function, with $\mu$ as a translation parameter, for which
the Spearman estimator has full asymptotic efficiency is the logistic distribution. 
There are distributions (with first moments) for which the Spearman estimator 
has asymptotic efficiency arbitrarily close to zero. High efficiencies are computed 
for the parametric models commonly used in bioassay. When the scale parameter is 
unknown the asymptotic efficiency of the Spearman estimator is at least that for the 
case of scale parameter known. 




760 RICHARD G. CORNELL (The Florida State University, Tallahassee, Fla.).
Tables of Sample Sizes and Applications for Estimating some Monotonic
Functions of the Ratio of Two Independent Poisson Variates.


The calculation of the efficiency of a vaccine or of an air sampling device is 
an example when it is necessary to estimate monotonic functions of the ratio of 
two independent Poisson variates. Confidence limits on these efficiencies can be 
obtained by using the approach presented by Bross in “A Confidence Interval for
a Percentage Increase,” Biometrics (1954). It is also possible to use this approach
to compute the sample size necessary to attain a confidence interval of predetermined
length for any true efficiency. Sample sizes computed in this manner are
tabled and applications are illustrated in this paper. 


761 JAMES E. GRIZZLE (Dept. of Biostatistics, Univ. of North Carolina,
Chapel Hill, N. C.). Asymptotic Power of Tests of Linear Hypotheses Using
the Probit or Logit Transformation.


The statistic for testing the fit of a model, or the statistic for testing a linear 
hypothesis under the model, when using probits or logits, has a central $\chi^2$-distribution
for large samples if the null hypothesis is true. If it is not true, the test statistic
has, asymptotically, a non-central $\chi^2$-distribution with a non-centrality parameter
that depends on the alternative hypothesis, the model and the transformation. 
Non-centrality parameters associated with tests of the two types of hypotheses 
are derived, and some cases of interest in bioassay when the response is quantal 
are examined. 


762 JOHN GURLAND (Mathematics Research Center, U. S. Army, Univ. of
Wisconsin, Madison, Wisc.). Some Bioassay Techniques for the Determination
of Minute Residues.


A fortification procedure, whereby known amounts of the Standard material 
are added to weak test preparations, is suggested for biological assays involving 
housefly mortality and employing the “film method” and “topical method” respectively.
According to the former method a thin film of toxicant is distributed on
the walls of the jar in which the insects are exposed. Increasing doses are admin- 
istered by increasing the volume of fortified extract employed in distributing the 
film of toxicant on the walls of the jar. According to the latter method, a constant 
volume (one microliter) is applied to the mesonotum of each fly exposed. This
requires a separate fortification for each dose. By maintaining a constant ratio of 
toxicant added to toxicant present it is possible to apply the tests of linearity and 
parallelism and also to obtain an estimate of relative potency. Practical considera- 
tions in preparing solutions of desired strength (although the potency is unknown 
and must be estimated), are suggested, whereby a trial value of the relative potency 
is employed in obtaining the final estimate. 


763 DEWEY L. HARRIS (Iowa State University, Ames, Iowa). A Monte Carlo
Study of the Influence of Errors of Parameter Estimation Upon Index Selection.


The theory of genetic selection indexes is such that, with knowledge of certain 
genetic and phenotypic parameters, the index which will yield maximum genetic 
improvement may be chosen. However, in practice, these parameters are not
known exactly and estimates are used in index construction. The inaccuracies of
estimation result in indexes which will yield progress somewhat less than the maximum
attainable progress. The errors of parameter estimation also lead to inaccuracies
of estimating the progress from selection on a particular calculated index.

Parameter estimation from analyses of variance and covariance among traits 
of individuals classified into paternal half-sib groups was considered. The magnitude 
of the mean decrease in progress, the tendency to over- or under-estimate progress, 
and the accuracy of estimating progress were evaluated by “Monte Carlo” sampling
procedures and by the development of approximate equations for various combina- 
tions of the true genetic and phenotypic parameters and amounts of data used for 
estimation. The results for these situations indicate that data involving at least 
1000 individuals are necessary for construction of a reasonably effective index. 
However, progress is not estimated very accurately with this volume
of data.


764 H. O. HARTLEY (Iowa State University, Ames, Iowa). Analytic Studies
of Sample Surveys.


Analytic studies in sample surveys are concerned with estimating and comparing 
the mean-characteristics for certain sub-sections of the population called ‘domains 
of studies’. The first part of the paper is concerned with formulas for the estimation 
of the totals and means of characteristics for all the units in a domain. Formulas 
of the variances of the estimates are also provided.

The second part of the paper raises the question of ‘optimum design’ for analytic 
studies and formulates this as a problem of minimizing the survey cost subject to 
tolerances for the variances of domain comparisons. Non-linear programming is 
applied to a simple special case in which domains are strata.


765 THEODOR HEIDHUES (Department of Animal Husbandry, Cornell University,
Ithaca, New York). Relative Accuracy of Selection Indices Based
on Estimated Genotypic and Phenotypic Parameters.


Empirical sampling techniques were used to investigate the effect of errors of 
parameter estimation on the accuracy of the selection index method. Under the 
assumptions of multivariate normal distribution of genotypic values and phenotypic 
observations and no genotype-environment interaction, samples of various sizes 
from two classes of underlying distributions were generated by an electronic com- 
puter. Genotypic values and phenotypic observations were computed such that 
expected values, variances and covariances of generated variables were equal to 
the respective underlying population parameters. 

The first class of problems was concerned with indices which include phenotypic 
observations on an individual and its relatives in the same trait. The covariance 
matrix between genotypic values of relatives can be inferred from knowledge of 
the genetic mechanism and need not be estimated. The measure of the relative 
accuracy of an index based on estimated as compared to true parameters was taken 
to be the ratio of realized to expected correlation between genotypic value and its 
estimate by a particular index procedure. The decrease in accuracy due to use of 
estimated parameters depended upon the underlying ratio of genotypic to total 
variance and sample size. Evidence is strong that full utilization of genetic
knowledge of the population structure increases the accuracy of an index procedure.
Variables with low partial correlation with the genotypic value to be evaluated 
and high correlation with other variables in the index equation should be discarded 
from the index in certain cases. An index based on an estimated covariance matrix 
yielded almost identical accuracies when applied to the same sample or to different 
samples of the same population. 

The second, more general class of problems included indices based on phenotypic
observations of the trait under selection and on a genetically correlated trait. The
decrease in accuracy depended strongly on the ratio of genotypic to total variance
of the selected trait. If estimates of elements of the phenotypic or genotypic
covariance matrix are “unreasonable”, i.e. if they exceed theoretically determined
limits, they should be modified to increase the accuracy. The occasionally used
practice of assuming the genotypic covariance between two traits to be zero cannot 
be generally recommended. 


766 E. H. LEHMAN (Statistical and Computing Laboratory, Purdue University,
Lafayette, Ind.). The Peculiar Variance of the Estimator of α, the Scale
Parameter of the Weibull Distribution.


Assume N units selected randomly from a population whose life span follows
the Weibull distribution function F(t) = 1 − exp[−(t^M/α)], t > 0, with M (the shape
parameter) known. Observe the failure times of the first few units and stop the
test when R have failed and time T has elapsed. The maximum likelihood estimator
α̂ of α derived from this test possesses some baffling properties. If R and T happen
to be fixed and small, the variance of α̂, considered as a function of N, decreases
at first, then increases for a while, and later decreases monotonically. Thus it
appears that for a certain interval of N, a small sample gives more information
than a large one. The reason for this is that if r, the actual number of failures
(≥ R, small), exceeds R by only a small integer, or if d, the actual test duration
(≥ T, small), exceeds T by only a small period, the α̂ then employed is badly biased
and highly variable. For small N, the probability of using these poor estimators grows
for a while before it shrinks, and hence the variance of α̂ as a whole follows this same
surprising sinuous pattern.
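
The behaviour described above can be explored by simulation. The following Python sketch is an illustration only and is not taken from the paper: the function names, the seed, and the number of replications are assumptions, and the estimator used is the usual censored-sample maximum likelihood form with M known, α̂ = [Σ t_i^M + (N − r) d^M] / r, where the test stops at d = max(T, time of the R-th failure).

```python
import random

def censored_weibull_mle(N, R, T, alpha, M, rng):
    """Simulate one life test and return (alpha_hat, r, d).

    Lifetimes follow F(t) = 1 - exp(-(t**M) / alpha). The test stops at
    d = max(T, R-th failure time), i.e. once R units have failed AND
    time T has elapsed. Requires N >= R >= 1.
    """
    # t**M / alpha is standard exponential, so t = (alpha * E)**(1 / M).
    times = sorted((alpha * rng.expovariate(1.0)) ** (1.0 / M) for _ in range(N))
    d = max(T, times[R - 1])               # actual test duration
    failed = [t for t in times if t <= d]  # failures observed by time d
    r = len(failed)                        # actual number of failures (>= R)
    # Censored-sample MLE with M known (assumed form, see lead-in).
    total = sum(t ** M for t in failed) + (N - r) * d ** M
    return total / r, r, d

def variance_of_mle(N, R, T, alpha, M, reps=2000, seed=0):
    """Monte Carlo estimate of Var(alpha_hat) for a given sample size N."""
    rng = random.Random(seed)
    est = [censored_weibull_mle(N, R, T, alpha, M, rng)[0] for _ in range(reps)]
    mean = sum(est) / reps
    return sum((e - mean) ** 2 for e in est) / (reps - 1)
```

Calling `variance_of_mle` for a range of N with small fixed R and T gives a rough picture of how the variance changes with sample size; whether the non-monotone pattern shows up depends on the parameter values chosen.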


767 DONALD C. MARTIN and S. K. KATTI (Florida State University, Tallahassee,
Florida). Fitting Certain Contagious Distributions to Some of the
Available Data by Maximum Likelihood Method.


The sample distributions obtained by Beal [1940], Ecology 21, Bliss and Fisher
[1953], Biometrics 9, and McGuire et al. [1957], Biometrics 13, have been frequently
employed to test the fit of many theoretical distributions. The distributions that 
have been found to have large enough regions of applicability are the Neyman Type 
A, Negative Binomial, Poisson Binomial, and the Inflated Poisson. All of these 
require numerical methods or tables to estimate the parameters by maximum likeli- 
hood. Specifically the problem of estimating parameters in the Neyman Type A 
and the Poisson Binomial is considerably involved. As a result, newly formulated
distributions have been compared with inefficient fits, e.g. moment fits,
of these distributions, thereby confounding the superiority of the new distribution
with the superiority of the method of estimation. The present authors have obtained
maximum likelihood fits, by using an electronic computer, for most of the data that
they feel are promising in studying the problem of curve fitting. They would like to
report that the values of the chi squares used for testing the goodness of fit do not
indicate any consistent good fit by one of these distributions.


768 M. R. SAMPFORD (A. R. C. Unit of Statistics, Aberdeen, Scotland). A 
Problem in Cluster Sampling with Replacement. 


When the units of a population fall naturally into clusters of unequal size,
selection of clusters with probability proportional to size, with replacement of
selected clusters, provides easily calculated and unbiased estimators of the popula- 
tion mean and the variance of its estimate. Sampling usually proceeds until a 
pre-determined number n of clusters (not necessarily all different) has been chosen: 
under this system some economy of resources is achieved (since the mean of a cluster 
selected twice need be determined only once), at the cost of some increase in variance, 
possibly large when sampling is from a small population or stratum. 

If resources to determine n cluster means are available, a sample containing 
n distinct clusters, and so providing a more precise estimate, might be preferred. 
If the usual method of sampling with replacement is continued until the (n + 1)th 
distinct unit is chosen for the first time, at the (r + 1)th drawing, the sample con- 
sisting of the first r cluster means (corresponding to the first n distinct clusters 
chosen) provides unbiased estimators when analyzed as though r (rather than n) 
had been pre-determined.

A small modification provides an unbiased estimator of the sampling variance
when clusters are sub-sampled. 
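
The stopping rule described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's published procedure: the function names and the `sizes` and `cluster_means` arguments are hypothetical, and the estimator shown is simply the unweighted mean of the cluster means over the first r draws, which is the usual estimator of the population mean per unit under sampling with probability proportional to size and with replacement.

```python
import random

def sample_until_n_distinct(sizes, n, rng):
    """Draw clusters with probability proportional to size, with replacement,
    until the (n + 1)th distinct cluster appears; return the r draws made
    before that one (they contain exactly n distinct clusters)."""
    if not 0 < n < len(sizes):
        raise ValueError("n must satisfy 0 < n < number of clusters")
    indices = range(len(sizes))
    draws, seen = [], set()
    while True:
        i = rng.choices(indices, weights=sizes, k=1)[0]
        if i not in seen and len(seen) == n:
            return draws  # the (r + 1)th draw is discarded
        seen.add(i)
        draws.append(i)

def estimate_mean(draws, cluster_means):
    """Mean of the drawn cluster means, repeats included, as though r
    (rather than n) had been fixed in advance."""
    return sum(cluster_means[i] for i in draws) / len(draws)
```

A cluster drawn twice contributes twice to the estimate, but its mean needs to be determined only once, so the cost is that of n distinct cluster means.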


769 MARVIN A. SCHNEIDERMAN and PETER ARMITAGE (National
Institutes of Health, and London School of Hygiene and Tropical Medicine,
Bethesda, Md. and London, England). A Family of Truncated Sequential
Plans.


For the normal deviate, variance known, a family of truncated sequential plans
(called ‘wedge’ plans) with outer boundaries identical with those of Wald,
Sobel-Wald, and Armitage has been developed. These plans have known, fixed Type I
and Type II error and constitute a general class, of which the Wald (open) schemes
are one extreme special case and the Armitage (restricted) schemes with a vertical
middle boundary, the other extreme special case.

Plans have been developed for both the one-sided (two-decision) and the
two-sided (three-decision) case. Monte Carlo trials comparing ‘wedge’ schemes
with equivalent open schemes (Wald and Sobel-Wald) show somewhat increased
average sample sizes for the wedge schemes in the vicinity of H0, reduced sample
sizes for values of the parameter, θ, between H0 and H1 (θ1 > θ0), and equivalent
sample sizes for θ > H1. The variance of the ASN appears smaller for the wedge
schemes at all values of the parameter. Tables of coordinates of the wedge for nine
common combinations of α and β have been computed, and will be published.


770 R. J. TAYLOR and H. A. DAVID (Dept. of Statistics, Virginia Polytechnic
Institute, Blacksburg, Va.). Sequential Allocation of Patients in Clinical
Trials.


This paper describes a scheme for sequentially altering the proportion of patients 
assigned to the various treatments of a clinical trial according to the results obtained
as the trial proceeds. Superior treatments are, by this means, allocated a higher
proportion of patients than inferior ones. This alteration of proportion is performed
by use of a weighting function, several alternative forms of which are given. A 
simulation study of the efficacy of the procedure with regard to its ability to select 
correctly the best treatment is described and the results presented. These results 
indicate that, with the use of appropriate weighting functions, this procedure is 
better able to select the best treatment than an equal allocation trial using the same 
number of patients. This comparison has been made on the basis of Sobel and 
Huyett’s (1957) study of the equal allocation case. The study shows that the 
weighting functions which are most efficacious in correctly selecting the best treat- 
ment are the ones that tend to assign the largest proportion of patients to the best 
treatment. 


A theoretical study of these statistics in special situations is also discussed. 


771 L. H. WADELL (Department of Animal Husbandry, Cornell University,
Ithaca, N. Y.). Selection Bias in Intraclass Correlation Repeatability
Estimates.


Research workers in the field of quantitative genetics use functions of variance 
components as estimates of parameters needed in the design and application of 
selection programs. The estimates of these parameters generally have to come from 
selected data. This paper demonstrates the bias that is introduced into the intra- 
class correlation when computed from a one-way classification analysis with unequal 
subclass numbers where the unequal subclass numbers are caused by systematic 
truncation culling. Empirical sampling results are given to demonstrate this bias 
which varies with culling intensity and with the size of the true intraclass correlation. 
The bias introduced by increasing the culling intensity is greater for a low true 
repeatability than for a high true repeatability. A correction technique is given 
for this analysis which eliminates this bias. The accuracy of this method is supported 
by empirical sampling results. 
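
For reference, the uncorrected estimator discussed in this abstract can be written out as follows. The Python below is a sketch of the standard analysis-of-variance estimate of the intraclass correlation for a one-way classification with unequal subclass numbers; it does not implement the author's correction for culling, which the abstract does not describe, and the function name is an assumption made for the example.

```python
def intraclass_correlation(groups):
    """ANOVA estimate of the intraclass correlation (repeatability) for a
    one-way layout; `groups` is a list of lists of records, one list per
    class, and the lists may have unequal lengths."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    N = sum(sizes)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(n * (m - grand) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    # Effective subclass number for unequal class sizes.
    n0 = (N - sum(n * n for n in sizes) / N) / (k - 1)
    var_between = (ms_between - ms_within) / n0
    return var_between / (var_between + ms_within)
```

When records are removed by truncation culling, the class sizes depend on the records themselves, which is the situation in which the abstract reports this estimate to be biased.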


772 R. M. ZAKI, B. B. BHATTACHARYYA and R. L. ANDERSON (North
Carolina State College, Raleigh, N. C.). On a Problem of Production Planning
Over Time.


This paper is concerned with the decision problem faced by a firm which produces 
a nonstorable commodity and has to spend large amounts of capital on a specialized 
factor that could be idled part of the time by fluctuations in production. In particular,
it is assumed that the firm uses f resources, N_r units of the rth resource being
available for production. These resources are to be allocated to one or more of T
different time periods, n_rt units of the rth resource to the tth period. Each of these
n_rt units can produce y_rt units of output at a price less direct variable costs of p_rt.
The cost of the specialized factor for being available in any of the T time periods is
approximated by a constant multiple of the maximum production in any one time
period. The decision problem is to determine the allocation plan {n_rt} which
maximizes
subject to the restrictions


A linear programming formulation of the problem is given. A procedure is 
developed for finding the optimal solutions to the problem for each value of c in 
the interval c > 0. Explicit optimal solutions are given for a simplified model 
with f = 1. 


The following are abstracts of papers presented to W.N.A.R. at the 
University of Washington, Seattle, Wash., June 14-17, 1961. 


773 WALTER A. BECKER and LAWRENCE R. BERG (Washington State
University, Pullman and Puyallup, Washington). Factors Affecting the
Sensitivity of Growth Experiments.


A series of chick growth experiments utilizing various genotypes and diets
was performed to investigate the influence of different factors upon the sensitivity
of experiments. F statistics (M.S. among treatments / M.S. among individuals
within treatments in one-way layouts) were used to measure sensitivity. All body
weight data were transformed to logarithms.

In experiments that determined differences among diets, the within treatment 
variance increased as the animals grew older, reached a plateau, and then declined. 
The F value acted in a similar manner. The within treatment variance was greater 
for birds fed sub-optimal diets than for those fed optimal diets. When determining 
differences among strains, the highest F values occurred when birds were given the 
optimal diets. In nutritional research the “best” animals, in terms of producing
the most sensitive experiment, were those with highest nutritional requirements.


774 NEETI R. BOHIDAR (Iowa State University, Ames, Iowa). Monte Carlo
Investigations of the Effect of Linkage on Selection.


A Monte Carlo investigation was undertaken to study the effect of linkage on 
the efficiency of selection. A program for the “Cyclone”, the high speed digital
computer located at Iowa State University, was written to accommodate any combination
of the following facets: type of initial population, dominance, epistasis, linkage
relations, selection intensity and some type of selection. The biological parameters
involved in the actual numerical work were as follows: two types of population,
repulsion and coupling; four types of dominance, no dominance, complete dominance,
overdominance, and mixed dominance; three types of character, character expressed
by males, females and both sexes; three types of truncation, upper extreme,
intermediate and lower extreme; two selection intensities, 20/40 and 5/40;
and nine types of linkage relations, .5, .3, .1, .03, .015, .007, .003, t_f and t_m, where
t_f stands for tight linkage in females and t_m stands for tight linkage in males. A graphical
method of representation was used to provide a clear picture of the situation.
The results gave definite indications of the roles of these factors on the effects of 
selection and offered a comparative study of the effects of the combination of
different facets of interest on the efficiency of selection.


775 H. D. BRUNK (University of Missouri, Columbia, Mo.). A Statistic Related
to Kolmogorov’s.


Let F_n denote the empiric distribution function of a random sample of size n
from a population. In connection with distributions on a circle, Kuiper introduced
the statistic max_x [F(x) − F_n(x)] − min_x [F(x) − F_n(x)] [Indag. Math.-Proc. Kon.
Nederlandse Akad. Wet., Ser. A, 63 (1960) 38-47], for testing the hypothesis that
the population has distribution function F. The statistic studied here is essentially
Kuiper’s: it bears the same relationship to Kuiper’s as does Pyke’s [Ann. Math.
Stat., Vol. 30 (1959), pp. 568-576] modification of Kolmogorov’s to Kolmogorov’s
itself. The statistic occupies a position intermediate between Kolmogorov’s [Inst.
Ital. Attuari, Giorn., 4 (1933) 1-11] and Sherman’s [Ann. Math. Stat., 21 (1950)
339-361], and the corresponding test appears more powerful than Kolmogorov’s
against certain alternatives (e.g. different scale parameter, for a symmetric
distribution), and less powerful against others (e.g. different location parameter).
Asymptotically the statistic coincides with that of Kuiper, who gives the asymptotic
distribution (loc. cit.). A theorem of Sparre Andersen [Skand. Aktuarietidskrift, 36
(1953) 123-138] makes possible an essential simplification of the problem of
determining the distribution for finite sample size. After this simplification, methods
developed for Kolmogorov’s statistic by Kolmogorov, Feller, renee and others
can be used. Tables are in preparation.
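
As a concrete illustration of the quantity involved, the following Python sketch computes Kuiper's statistic, max_x [F(x) − F_n(x)] − min_x [F(x) − F_n(x)], for a sample and a hypothesized continuous distribution function. It is a sketch only: it does not implement the modified statistic studied in this abstract, and the function name and arguments are assumptions made for the example.

```python
def kuiper_statistic(sample, cdf):
    """Kuiper's statistic for `sample` against a continuous distribution
    function `cdf` (a callable returning F(x))."""
    xs = sorted(sample)
    n = len(xs)
    # F - F_n is largest just before a jump of F_n: F(x_(i)) - (i - 1)/n.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    # F - F_n is smallest at a jump of F_n: F(x_(i)) - i/n, so the negated
    # minimum is the largest value of i/n - F(x_(i)).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    return d_plus + d_minus
```

For the sample 0.1, 0.4, 0.7 tested against the uniform distribution on (0, 1), the two one-sided deviations are 0.3 and 0.1, so `kuiper_statistic([0.7, 0.1, 0.4], lambda x: x)` is 0.4 up to rounding.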


776 JAMES L. LEITCH (Laboratory of Nuclear Medicine and Radiation Biology,
School of Medicine, University of California, Los Angeles, Cal.). Radiation
Effects and Their Statistical Evaluation: Introduction.


With the ever-increasing interest in the biological effects of ionizing radiation,
a review of the various facets in this field is considered timely. It is necessary
that all parameters influenced by (1) radiation characteristics, (2) biological
characteristics, (3) pre-irradiation conditions, (4) post-irradiation conditions,
(5) treatment (protective) factors, and (6) criteria for evaluation of the biological
effects be considered in any statistical appraisal. Initially these factors will be
presented in outline form and discussed in more detail in subsequent papers.


777 JAMES L. LEITCH (Department of Nuclear Medicine and Radiation Biology,
School of Medicine, University of California, Los Angeles, Cal.). Biological
Factors in Radiation Effects.


A general review of literature data will be presented on the relationship between 
various biological factors and the radiation syndrome. Special emphasis will be 
placed on the treatment (protective) factors which may modify the basic syndrome. 
New experimental data involving X-ray effects on mice will be presented relative
to the following parameters: (a) biological variability, (b) a possible seasonal effect,
(c) cage effect and (d) interrelationship between dose rate and protection. An 
initial approach to the statistical evaluation of radiation experiments will be discussed. 


778 FRANK J. MASSEY, Jr. and CARL E. HOPKINS (School of Public Health,
University of California, Los Angeles, Cal.). Tables of Exact Sampling
Distribution of R².

The exact sampling distribution of the multiple correlation coefficient R² has
been computed on the IBM 709 of Western Data Processing Center at UCLA and
tabled for various magnitudes of the parameter, sample size, and number of variables.
The density function and the cumulative distribution function are given in intervals
of 0.01 in R², the sample coefficient.

These distributions should be useful in applications requiring significance tests
and confidence intervals for sample R²’s where the hypothesis is non-zero, and for
determining the power and sample size requirements of projected studies, such as
epidemiologic surveys, in which the population R² is expected to be non-zero.


779 ORSELL M. MEREDITH (Laboratory of Nuclear Medicine and Radiation
Biology, School of Medicine, University of California, Los Angeles, Cal.).
Comparison of Dose-Rate Effects on CF-1 Mouse Mortality Between 250
KVP X-Rays and Cobalt-60 Gamma-Rays.


A comparison of acute radiation mortality response with dose rates ranging
from 2-170 r/min has been performed with CF₁ female mice. Analyses of response
have been based upon methods of probit analysis elaborated by Finney. A rapid
increase in the LD50(30) level was observed for either Co⁶⁰ γ-rays or 250 KVP X-rays
as the exposure dose rate was reduced below 20 r/min. On the other hand, little
change in LD50(30) was observed with increase of exposure dose rate above 20 r/min.
For either type of radiation source there was no significant deviation from parallelism
when all of the probit dose response curves were considered. In addition, a study
has been made of the comparative applicability, to Co⁶⁰ γ-ray and 250 KVP X-ray
results, of various mathematical models which have been proposed for radiation
mortality response.


780 STANLEY R. PERSON (Laboratory of Nuclear Medicine and Radiation
Biology, School of Medicine, University of California, Los Angeles, Cal.).
Relationship Between Physical Characteristics of Radiation and its Biological
Effect.


Physical factors affecting the radiation sensitivity of whole animals will be 
discussed. Factors to be discussed center around changes in the radiation sensitivity 
brought about by use of radiation of different qualities. Data from the literature 
will be presented on the RBE of radiations that give rise to different rates of energy 
loss. A discussion of current dosimetry methods and factors affecting accurate 
dosimetry will be given. 


781 A. D. WIGGINS (General Electric Company, Richland, Wash.). Further
Aspects of a Multicompartment Migration Model.


The present paper represents an effort to extend the results of an earlier paper 
[Wiggins (1960). On a Multicompartment Migration Model With Chronic Feeding.
Biometrics 16:4, 642-58] in several directions. First, the earlier model is generalized
to include the possibility of an independent source or “feeding function” within each
compartment. Second, a result of S. Bernstein [P. Lévy (1948). Processus Stochastiques
et Mouvement Brownien. Gauthier-Villars, Paris. p. 64], namely the derivation
of the one-dimensional diffusion equation of probability theory starting from a 
stochastic differential equation, is extended to K dimensions. The K-dimensional 
diffusion equation corresponding to the present migration model is then derived 
and an attempt is made to solve the equation in two dimensions.

Two numerical examples resulting from the application of the estimation
procedure of the earlier Biometrics paper to experimental data are presented. The
resulting graphs are plotted and compared with a plot of the experimental points.


CORRECTIONS 


J. A. Nelder [1961]. The Fitting of A Generalization of the Logistic Curve. Biometrics 
17, 89-110. 


In the above paper the following reference on page 110 was omitted: 


Skellam, J. G., Brian, M. V., and Proctor, J. R. [1959]. The simultaneous growth 
of interacting systems. Acta Biotheoretica 13, 131-144. 
Also the algebraic expression in the heading of Table 1, (p. 91), should read 


[1/(1 + 


R. C. Elston [1961]. On Additivity in the Analysis of Variance. Biometrics 17, 
209-19. 


The fourteenth line on page 215 should read: “null hypothesis, and, if condition
(5) holds, provides an approximate.”


THE BIOMETRIC SOCIETY
Brazilian Region
REPORT ON THE MEETING OF THE BRAZILIAN REGION


On March 29, 1961, the Brazilian Region of the Biometric Society held its
annual meeting at the Department of Statistics of the Faculty of Hygiene and
Public Health of the University of São Paulo, São Paulo, Brazil.

The first part of the meeting was devoted to the presentation of the following 
scientific papers: 


“Cálculo da distância morfolética em 4 grupos de Laelia sp.” by Rolando
Vencovsky.

“Relação entre a Análise Tradicional de Experimentos Fatoriais de Adubação
de 3 e a Superfície de Resposta Respectiva” by F. Pimentel Gomes.

“Análise de um experimento sôbre aplicação de micro-elementos em cafeeiros”
by H. Vaz de Arruda.

“Correction for bias introduced by a transformation of variables” by Jerzy
Neyman, as a visitor at the University of São Paulo.

The second part was devoted to the annual report for 1960 and the election 
of 1961 regional officers. The report and accounting demonstration for 1960 were 
submitted to the attendant members and approved. 

In accordance with the results of the election, the names of new officers submitted 
for the approval of the Council of the Society are: 


President—Adolpho M. Penha 
Treasurer—Americo Groszman 
Secretary—Elza Berquó
Council Members:
Pompeu Memória
José T. A. Gurgel 
Frederico Pimentel Gomes 
Ruy Aguiar da Silva Leme 
Frederico G. Brieger 
Geraldo Garcia Duarte.


British Region 

At a meeting held on April 18th, 1961, the following papers were read and 
discussed: 

G. Harrington: Studies of Visual Judgments of Quality in Bacon. 

J. M. Tanner and M. J. R. Healy: Assessment of Maturity from X-rays of 
the Wrist and Hand. 
W.N.A.R. 

ANNUAL MEETING 


The annual meeting of the WNAR Biometric Society was held at the University 
of Washington, Seattle, Washington, on June 14, 15, and 16, 1961, in conjunction
with meetings of the Institute of Mathematical Statistics, the Section on Physical
and Engineering Sciences of the American Statistical Association, the American 
Mathematical Society, and the Institute of Management Sciences; and a special 
“Symposium on Convexity” sponsored by the American Mathematical Society. 


Program 
Wednesday, June 14, 1961 
7:30-10:00 p.m.—Informal Inferential Procedures (IMS, ASA-SPES and WNAR) 

Chairman: A. M. Mood, C.E.I.R., Inc., Los Angeles, California 

1. “The Future of Data Analysis.” J. W. Tukey, Princeton University and
Bell Telephone Laboratories, Murray Hill, New Jersey.
2. “Some Sequences of Fractional Replicates.” C. Daniel, New York.
3. “Graphical Methods for Internal Comparisons in Multi-response Experiments.”
M. B. Wilk and R. Gnanadesikan, Bell Telephone Laboratories,
Murray Hill, New Jersey.


Thursday, June 15, 1961 
8:30-10:00 a.m.—Stochastic Processes in Biology 
Chairman: A. T. Bharucha-Reid, University of Oregon 
1. “Further Aspects of a Multi-Compartment Migration Model.” A. D. 
Wiggins, General Electric Company. 
2. “A Statistic Related to Kolmogorov’s.” H. D. Brunk, University of Missouri. 


10:30 a.m.-12:30 p.m.—Planning and Analysis of Experiments (ASA-SPES and WNAR)

1. “The Consideration of Variance and Bias Errors in the Selection of a Response 
Surface Design.” G. E. P. Box, University of Wisconsin, and N. R. Draper, 
Mathematics Research Center, U. S. Army. 

2. “Orthogonal Main-Effect Plans.” Sidney Addelman, Research Triangle 
Institute. 

3. “Asymmetric Factorial Designs and the Direct Product.” B. Kurkjian, 
Diamond Ordnance Fuze Laboratories, and M. Zelen, University of Mary- 
land and National Bureau of Standards. 


2:15-3:45 p.m.—Radiation Effects and their Statistical Evaluation 

Chairman: James L. Leitch, Laboratory of Nuclear Medicine and Radiation 
Biology, U.C.L.A. 

1. Introduction 

2. “Relationship between Radiation Characteristics and Radiation Effects.”
S. R. Person, Laboratory of Nuclear Medicine and Radiation Biology, 
U.C.L.A. 

3. “Biological Factors Involved in Radiation Effects.” James L. Leitch.

4. “Comparison of Dose-Rate Effects on CF-1 Mouse Mortality Between 250
KVP X-Rays and Cobalt-60 Gamma Rays.” Orsell M. Meredith, Laboratory
of Nuclear Medicine and Radiation Biology, U.C.L.A. 


Friday, June 16, 1961 


8:30-10:30 a.m.—Estimation (IMS and WNAR) 


1. “Combining Information in Incomplete Blocks.” F. A. Graybill, Colorado
State University.

2. “Estimation with Minimum Mean Square Error.” H. O. Hartley, Iowa
State University.

3. “Remarks on the Efficiency of Unbiased Estimation with Auxiliary Variates.”
W. H. Williams, Bell Telephone Laboratories, Murray Hill, New Jersey.


1:00-2:30 p.m.—Contributed Papers


Chairman: A. D. Wiggins, General Electric Company 

1. “Monte Carlo Investigation of the Effect of Linkage on Selection.” N. R. 
Bohidar, Utah State University. 

2. “Tables of Exact Sampling Distribution of R².” F. J. Massey, Jr. and Carl
E. Hopkins, U.C.L.A.

3. “Various Levels of Riboflavin and the Sensitivity of Experiments on Growth.”
W. A. Becker and L. R. Berg, Washington State University. 


CHANGES IN MEMBERSHIP 
(January 15-July 15, 1961) 
Changes of Address 


Mr. Ross W. Adams, 1251 Hawthorne, Ames, Iowa, U.S.A.

Mr. B. L. Adkins, Statistics Department, University of New England, Armidale, 
N.S.W., Australia. 

Miss Margaret F. Allen, School of Aviation Medicine, USAF, Brooks AFB, Texas, 
U.S.A. 

Mr. H. A. J. Amand, 21 Place Cardinal Mercier, Rizensart, Belgium. 

Mr. Donald W. Bailey, Cancer Research Institute, Univ. of California Medical
Center, San Francisco 22, California, U.S. A.

Mr. B. O. Bartlett, Agricultural Research Council, Letcombe Regis, Wantage,
Berkshire, England.

Dr. Glenn E. Bartsch, Department of Preventive Medicine, Western Reserve
University, Cleveland 6, Ohio, U.S. A. 

Mrs. Hannelore Beyer, Haertelstr. 16-18, Leipzig C 1, Germany. 

Mr. Paul Blunk, 4616 Plantation Drive, Fair Oaks, California, U.S. A. 

Mr. W. F. Bodmer, Department of Genetics, University of Cambridge, Cambridge,
England.

Mr. Roger L. Bollenbacher, 860 Hiawatha Drive, Elkhart, Indiana, U.S. A.

M. Jacques Bredas, 24 rue Grand Bry, Montigny-Le-Tilleul, Belgium.

Dr. Leroy S. Brenna, The Texas Company, 12th Floor Chrysler Bldg., New York,
N.Y.

Dr. A. Brown, Department of Mathematics, Australian National University, 
Canberra City, A.C.T., Australia. 

Dr. Robert V. Brown, Box 181, Edgewood, Maryland, U.S. A. 

Dr. W. R. Buckland, The Economist Intelligence Unit, St. James, London, S.W. 1,
England.

Mr. A. Burny, 77 avenue des Combattants, Gembloux, Belgium.

Mr. Lyle D. Calvin, Department of Statistics, Oregon State University, Corvallis,
Oregon, U.S. A.

Dr. A. H. Carter, 1 Hackin Place, Fairfield, Hamilton, New Zealand. 

Mr. Melvin W. Carter, Department of Mathematics and Statistics, Purdue Uni- 
versity, Lafayette, Indiana, U.S. A. 

Mr. David B. Christian, 33 Crestview Drive, Whitesboro, New York, U.S. A.

Mr. Frank B. Cramer, 17331 Tribune Street, Granada Hills, California, U.S. A.

Mr. Marc Dalebroux, c/o Instituto di Genetica, Universita di Pavia, Pavia, Italy.

Dr. James G. Dare, Department of Pharmacy, University of Queensland, Brisbane,
Australia.

Dr. Richard J. Daum, 3900 Hamilton Street, Hyattsville, Maryland, U. S. A.

Miss M. E. Davis, Department of Agriculture, Box 1500, Wellington, New Zealand.

Mr. R. De Coene, 103 rue Edith Cavell, Bruxelles 18, Belgium. 

M. Jean Dejardin, ORSTOM, 24 rue Bayard, Paris (VIII°) France.

Mr. R. Delhaye, 36 avenue Jean Van Raclen, Bruxelles 16, Belgium.

Dr. Daniel B. De Lury, Department of Mathematics, University of Toronto,
Toronto 5, Canada.

Mr. A. Deville, 2 rue Middelbourg, Boitsfort, Belgium.

Mr. H. M. Dicks, Department of Agriculture, J. S. Marais Bldg., Stellenbosch, 
South Africa. 

M. Pol Dineur, Golsinnes, Bossiere par Masy, Belgium. 

Miss Irene L. Doto, Communicable Disease Center, 2082 West 38th Street, Kansas 
City 3, Kansas, U.S. A. 

Dr. D. B. Duncan, Department of Biostatistics, The Johns Hopkins University,
Baltimore 5, Maryland, U.S. A. 

Mr. Steve A. Eberhart, Department of Agronomy, Iowa State University, Ames, 
Iowa, U.S. A. 

Dr. F. Ectors, 111 rue du Centre, Assesse, Belgium. 

Mrs. Polly Feigl, Olof Skotkonungsgatan 66, Goteborg S. Sweden. 

Dr. Heinz Fink, Morgengraben 14, Koeln-Stammheim, Germany. 

Mr. Robert Fitzpatrick, 5229 21st Avenue, N.E., Seattle 5, Washington, U.S. A. 

Dr. Henry R. Fortmann, Agricultural Experiment Station, Pennsylvania State 
University, University Park, Pennsylvania, U.S. A. 

Mr. Robert A. Harte, Am. Soc. of Biological Chemists, 9650 Wisconsin Avenue, 
Washington 14, D. C., U.S. A. 

Prof. Dr. Jo Hartung, Ruehlmannstr. 8, Hannover, Germany. 

Dr. Don W. Hayne, Patuxent Wildlife Research Center, Laurel, Maryland, U.S. A.

Prof. Dr. J. Hemelrijk, Keizer Karelweg 83, Amstelveen (N.H.), Netherlands. 

Mr. Jean Henry, 1 rue Defacqs, Bruxelles, Belgium. 

Dr. Paul G. Homeyer, C-E-I-R, Inc., 11753 Wilshire Blvd., Los Angeles 25, California, 
U.S. A. 

Dr. Carl E. Hopkins, School of Public Health, University of California, Los Angeles 
24, California, U.S. A. 

Mr. Paul V. Hurt, Deerfield, Wisconsin, U.S. A. 

Dr. Peter Ihm, c/o Euratom, Casella postale 191, Como, Italy. 

Mr. Arthur G. Itkin, 1870 Clayton Road, Abington, Pennsylvania, U.S. A. 

Mr. William G. H. Ives, Forest Biology Laboratory, Box 6300, Winnipeg, Manitoba,
Canada.

Dr. Subodh K. Jain, Botany Division, Indian Agricultural Research Institute,
New Delhi 12, India.

Mr. Eugene A. Johnson, Industrial Engineering, University of Minnesota,
Minneapolis, Minnesota, U.S. A.

Dr. med. Herbert Jordan, Haus Tusculum, Bad Elster, Germany. 

Prof. Dr. Hans Kelleher, Ludwigstr. 18, Munchen, Germany. 

Mr. Thomas R. Konsler, Mt. Hort. Crops Research Station, Route 2, Fletcher, 
North Carolina, U.S. A. 

Dr. Paul Kuehne, Nachodstr. 19, Berlin W 30, Germany. 
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Mr. Thomas Ii. Kurtz, Mathematics Department, Dartmouth College, Hanover, 
New Hampshire, U. S. A. 
Mrs. Katherine B. Ladd, 4 Woodside Drive, South Burlington, Vermont, U.S. A. 
Dr. R. J. Ladd, Physiology Department, University of Queensland, Brisbane, 
Australia. 
Dr. G. E. J. Lambelin, 231 Chaussee d’Alsemberg, Bruxelles 18, Belgium. 
Dr. Lonnie L. Lasman, A. J. Wood Research Corporation, 42 South 15th Street, 
Philadelphia 2, Pennsylvania, U. S. A. 
Mr. Jay P. Leary, Jr., 1253 S. Longwood Avenue, Los Angeles 19, California, U.S. A. 
M. Jerome Lejeune, Institut de Progenese, 15 rue de l’Ecole Medecine, Paris (VI°), 
France. 
‘r. Robert Lichter, Ragis-Stat. Heidehof, Brockhoefe, Kr. Velzen, Germany. 
Dr. D. Lindley, Department of Statistics, University College of Wales, Aberystwyth, 
Cards, Wales. 
Mr. George F. Lunger, P. O. Box 583, Camden, New Jersey, U.S. A. 
Marcel J. W. Luttgens, 7 rue Van Oost, Brussels 3, Belgium. 
Prof. Dr. A. Maede, Grosse Steinstr. 81, Halle/Saale, Germany. 
Mr. James E. Mangan, 7443 N. Claremont Avenue, Chicago 45, Illinois, U. S. A. 
Mr. Stuart H. Mann, 511 W. University, Champaign, Illinois, U.S. A. 
Mr. Robert Marechal, J. M., 29 rue Docqs, Gembloux, Belgium. 
Dr. M. Maricz, 16 avenue des Abeilles, Ixelles, Belgium. 
Mr. T. J. Marynen, 11 ring Laan, Berchem-Anvers, Belgium. 
Mr. John W. Mayne, Operational Research Group, Defence Research Board, 
Ottawa, Canada. 
Mr. Judson U. McGuire, European Parasite Laboratory, 20 bis rue Sadi Carnot 
Nanterre (Seine) France. 
Mr. Martin Menzi, Kreuzrain, Hedingen (ZH) Switzerland. 
M. Philippe Merat, 20 rue de Louvre, Viroflay (S. and O.) France. 
Mr. Donald L. Meyer, 1646 S. State Street, Syracuse 5, New York, U.S. A. 
Dr. A. M. Mood, C-E-I-R, Inc., 11753 Wilshire Blvd., West Los Angeles, California, 
U.S. A. 
Prof. Sigeiti Moriguti, Department of Statistics, Stanford University, Stanford, 
California, U.S. A. 
Dr. M. B. Mueller, Plastics Division, Allied Chemical Corporation, Glenolden, 
Pennsylvania, U.S. A. 
Dr. Hugo Muench, Jr., 100 Memorial Drive, Cambridge 42, Massachusetts, U.S. A. 
Dr. Karl Heinz Muller, F. Schelling Str. 3, Jena, Germany. 
Mr. August Carl Nelson, Jr., 1015 Green Street, Durham, North Carolina, U.S. A. 
Dr. A. R. G. Owen, Department of Genetics, University of Cambridge, Cambridge, 
England. 
Dr. Erich Panse, c/o Lochow-Petkus GmbH, Bergen Krs. Celle, Germany. 
Dr. Benjamin Pasamanick, Columbia Psychiatric Institute and Hospital, Columbus 
10, Ohio, U. S. A. 
Dr. Mary Ellen Patno, 2451 Yost Boulevard, Ann Arbor, Michigan, U.S. A. 
Mr. R. Pierlot, 5 avenue des Phalenes, Bruxelles 5, Belgium. 
M. Jacques Poly, Service de genetique animale, 16 rue de l’Estrapade, Paris (V°) 
France. 
Mr. Joe Powell, 8245 Park Place Blvd., Apt. 6, Houston 17, Texas, U.S. A. 
Ar. Wolf Prensky, Department of Biology, Brookhaven National Laboratory, 
Upton, L.I., New York, U.S. A. 
Mr. Lester W. Preston, Jr., A. H. Robins Co., Inc., 1407 Cummings Drive, Richmond 
20, Virginia, U.S. A. 

Mr. Dieter Rasch, Freiligrathstr. 14, Rostock, Germany. 

Dr. Arthur Ringoet, 1 avenue du Congo, Bruxelles 5, Belgium. 

Mr. Erwin Roth, Jaminstr. 30, Erlangen, Germany. 

Dr. O. K. Sagen, 210 Fifth Street, S.W., Washington, D. C., U.S. A. 

Mr. Wilfred Salhuana, Universidad Agraria, Apartado 456, La Molina, Lima, Peru. 

Dr. Hellmut Schmalz, Berliner Str. 2, Hohenthurm-Saalkreis, bei Halle/Saale, 
Germany. 

Cand. Math. Berthold Schneider, Marburger Str. 18, Giessen, Germany. 

Dr. Francesco Sella, Instituto di Genetica, Via S. Epifanio 14, Pavia, Italy. 

Mr. William Seyffert, MPI f. Zuechtungsforsch, Post Bickendorf, Koeln-Vogelsang, 
Germany. 

Dr. Robert R. Shrode, 111 E. State Street, Sycamore, Illinois, U.S.A. 

Dr. Donald F. Starr, Route 1, Box 321 A, Grand Island, Nebraska, U. S. A. 

Mr. Otto Steiner, St. Wenderlstr. 50, Braunschweig, Germany. 

Mr. N. S. Stenhouse, Division of Math. Statistics, C.S.I.R.O., University of Adelaide, 
Adelaide, S. Australia. 

Dr. Klaus J. Stern, Manhagener Allee 84, Schmalenbeck/Ahrensburg, (Holst.) 
Germany. 

Miss Elizabeth Street, 231 West 13th Street, New York 11, N. Y., U. S. A. 

Mr. R. C. Tomlinson, 74 Gayton Road, Harrow, Middlesex, England. 

Dr. M. Torfs, 99 Bd Lambermont, Bruxelles 3, Belgium. 

Mr. Gerard Torreele, 5 Slachthuisstraat, Nieuwpoort, Belgium. 

Dr. Jean Vacher, 22 avenue Grammont, Tours (Indre-et-Loire) France. 

M. Raymond Van Den Driessche, 42 rue du Friquet, Bruxelles 17, Belgium. 

Mr. A. Van Parijs, 128 rue de la Loi, Bruxelles 4, Belgium. 

Mr. Thierry Waffelaert, 5 rue J. B. Verlooy, Anvers, Belgium. 

Prof. Dr. Erna Weber, Schenkestr. 8c, Berlin-Karlshorst, Germany. 

Mr. Irving Weiss, The Mitre Corporation, Bedford, Massachusetts, U.S. A. 

Mr. Robert White, Box 241, Dugway, Utah, U.S. A. 

Mr. Henry K. C. Woo, E-TAI Ltd., 95 Liberty Street, New York 6, N. Y., U.S. A. 

Dr. Gunter Wricke, ueber Lingen/Ems, Klausheide, Germany. 


New Members 
At Large 


Dr. Martin Eugene Dehousse, University of Ruanda-Urundi, B.P. 1550, Usumbura- 
Burundi, Africa. 


Mr. Hong Suk Lee, Last Crop Section, Agricultural Experiment Station, Suwon, 
Korea. 

Mr. Heliodoro Miranda M, Inter-American Institute of Agricultural Sciences, 
Turrialba, Costa Rica. 

Ing. Luis A. Montoya-Armas, Instituto Interamericano de Ciencias Agricolas, 
Turrialba, Costa Rica. 


Australia 


Prof. John Henry Bennett, Department of Genetics, University of Adelaide, 
Adelaide, South Australia. 


Dr. B. Diamantis, 2 Raleigh Street, Windsor S. 1, Victoria, Australia. 

Mr. Alan E. Stark, Div. of Fisheries and Oceanography, C.S.I.R.O., P. O. Box 21, 
Cronulla, N.S.W., Australia. 


British 

Dr. P. F. D’Arcy, Pharmacology Department, Allen and Hanburys Ltd, Ware, 
Herts. 

Mr. J. P. Evenson, Wellcome Research Laboratories, Langley Court, Beckenham, 
Kent, England. 

Mr. P. Hallam, 67 Tomline Road, Felixstowe, Suffolk, England. 

Mr. D. J. Harberd, Plant Breeding Station, Pentlandfield, Roslin, Midlothian, 
England. 

Mr. P. Holgate, “High Canons”, Well End, Barnet, Herts., England. 

Mr. G. J. Knight, Wellcome Research Laboratories, Langley Court, Beckenham, 
Kent, England. 

Mr. Ian McDonald, 6 Deeside Place, Aberdeen, Scotland. 

Mr. F. M. O’Carroll, 12 Dundela Avenue, Sandycove, Co. Dublin, Ireland. 

Prof. W. T. Williams, Botany Department, The University, Southampton, England. 


Belgian 


Mr. Joseph J. Gabriel, 54 avenue Dr Decroly, Uccle-Bruxelles 18, Belgium. 

Mr. Georges Geortay, 26/60 Avenue Georges Truffaut, Liege, Belgium. 

Dr. L. Goeminne, 98 Chaussee de Gand, Deinze, Belgium. 

Dr. Paul Janssen, Department de Recherches des Laboratories Pharmaceutique, 
Turnhout, Belgium. 

Dr. A. Jeurissen, Sanatorium, Buizingen, Belgium. 

Mr. Francois R. Martin, 5 Rue Caroly, Ixelles, Bruxelles, Belgium. 

Mr. Andre Pieteres, Brusselse steenweg 407, Gentbrugge, Belgium. 

Mr. Francois Sterckx, Yangambi IT, B.P. 1035, Stanleyville, Belgian Congo. 

Dr. Robert Van Vaerenbergh, 6 avenue du Soleil, Knokke, Belgium. 


ENAR 

Dr. Helen Abbey, 615 N. Wolfe Street, Baltimore 5, Maryland, U.S. A. 

Dr. Elliott T. Adams, P. O. Box 47, Upham’s Corner Station, Boston 25, 
Massachusetts, U.S. A. 


Dr. Daniel J. Baer, 2877 Valentine Avenue, New York 58, N. Y., U.S. A. 

Dr. Harle V. Barrett, Department of Preventive Medicine, The Creighton 
University School of Medicine, Omaha 2, Nebraska, U.S. A. 

Dr. A. F. Bartholomay, 12 Upland Road, Wellesley, Massachusetts, U.S. A. 

Miss Virginia B. Berry, Department of Mathematics, University of British Columbia, 
Vancouver B.C., Canada. 

Mr. Paul V. Blair, Populations Genetics Institute, Purdue University, Lafayette, 
Indiana, U.S. A. 

Dr. John R. Braunstein, 2123 Luray Avenue, Cincinnati 6, Ohio, U.S. A. 

Mr. Franklin W. Briese, 7500 Olivers Avenue South, Minneapolis 23, Minnesota, 
U.S. A. 

Mr. Paul M. Cohen, Technical Operations Inc., Box 37, Fort Monroe, Virginia, 
U.S. A. 

Mrs. Elizabeth F. Davis, Hazleton Laboratories Inc., Box 30, Biometrical Unit, 
Falls Church, Virginia, U.S. A. 

Dr. Roscoe A. Dykman, Division of Behavioral Sciences, University of Arkansas, 
Little Rock, Arkansas, U.S. A. 

Mr. John R. Flood, 94 Parkway Road, Bronxville, New York, U.S. A. 

Dr. D. H. Fogel, 1380 Bedford Street, Stamford, Connecticut, U.S. A. 

Dr. Seymour Geisser, Biometrics Branch, NIMH, Bethesda, Maryland, U. S. A. 

Dr. William T. Ham, Jr., Box 877, Medical College of Virginia, Richmond, Virginia, 
U.S. A. 

Mr. Herman B. Hamot, 835 Forbes Avenue, Perth Amboy, New Jersey, U.S. A. 

Dr. Dewey L. Harris, Statistical Laboratory, Iowa State University, Ames, Iowa, 
U.S. A. 

Dr. Edwin Hendler, 415 Brentwood Road, Havertown, Pennsylvania, U.S. A. 

Dr. Homer C. Jamison, 1919 Seventh Ave., S., Birmingham 2, Alabama, U. S. A. 

Mr. Denis J. Kelleher, Department of Animal Husbandry, Iowa State University, 
Ames, Iowa, U.S. A. 

Dr. Samuel J. Kilpatrick, Statistical Laboratory, Iowa State University, Ames, 
Iowa, U.S. A. 

Mr. Eugene Legler, Tennessee Game and Fish Commission, Cordell Hull Building, 
Nashville, Tennessee, U.S. A. 

Dr. Guillermo Llanos-Bejarano, 608 N. Collington Avenue, Baltimore 5, Maryland, 
U.S. A. 

Mrs. Ruth B. Loewenson, 4844 Xerxes Avenue, S., Minneapolis 
U.S.A. 

Dr. Josiah Macy, Jr., Department of Physiology, Albert Einstein College of Medicine, 
New York 61, N. Y., U.S. A. 

Mr. Robert H. Miller, Dairy Herd Improvement, Agricultural Research Service, 
USDA, Washington 25, D. C., U.S. A. 

Dr. Richard Moore, American National Red Cross, 18th and E., N.W., Washington 
6, D. C., U.S. A. 

Dr. Donald F. Morrison, Biometrics Branch, National Institutes of Health, Bethesda 
14, Maryland, U.S. A. 

Mrs. Sue W. Nealis, 6657-24th Place, Riggs Manor, Hyattsville, Maryland, U.S. A. 

Dr. Masatoshi Nei, Department of Genetics, North Carolina State College, Raleigh, 
North Carolina, U.S. A. 

Mr. Gill Nestel, 1219 S. State Street, Ann Arbor, Michigan, U.S. A. 

Mr. Marcelo M. Orense, Department of Experimental Statistics, North Carolina 
State College, Raleigh, North Carolina, U.S. A. 

Mr. James G. Osborne, Forest Service, USDA, South Building, Washington, D. C., 
U.S. A. 

Dr. Bernard S. Pasternack, New York University Medical Center, 550 First Avenue, 
New York 16, N. Y., U.S. A. 

Dr. H. V. Pipberger, 7439 Little River Pike, Annandale, Virginia, U. S. A. 

Mr. P. V. Rao, Department of Mathematics, University of Georgia, Athens, Georgia, 
U.S. A. 

Mr. Searle B. Rees, 9 Strathmore Road, Brookline, Massachusetts, U. S. A. 

Dr. Richard D. Remington, School of Public Health, University of Michigan, 
Ann Arbor, Michigan, U.S. A. 

Mr. J. C. Richards, Jr., The Standard Oil Company, Midland Bldg., Cleveland 12, 
Ohio, U.S. A. 

Mr. Richard H. Richardson, 156° Williams Hall, N. C. State College, Raleigh, 
North Carolina, U.S. A. 

Mr. Donald C. Riley, American Statistical Association, 1757 K Street, N.W., 
Washington 6, D. C., U.S. A. 

Dr. Ralph Rossen, L. E. Phillips Psychobiological Research Division, Mt. Sinai 
Hospital, Minneapolis 4, Minnesota, U. S. A. 

Mr. Raymond E. Roth, Department of Mathematics, St. Bonaventure University, 
St. Bonaventure, New York, U.S. A. 

Mr. Jagdish S. Rustagi, College of Medicine, University of Cincinnati, Cincinnati 
19, Ohio, U.S. A. 

Mr. Darshan Lal Sachdeva, 308 South Macomb, Tallahassee, Florida, U. S. A. 

Mr. Henry E. Schaffer, Department of Genetics, N. C. State College, Raleigh, 
North Carolina, U. S. A. 

Mr. Marvin A. Schneiderman, Biometry Branch, National Cancer Institute, 
Bethesda 14, Maryland, U.S. A. 

Mr. Wilfred M. Schutz, Department of Genetics, N. C. State College, Raleigh, 
North Carolina, U.S. A. 

Mr. Edward Selig, 130 Beach Avenue, Mamaroneck, New York, U.S. A. 

Mr. John L. Seliskar, U. S. Forest Service, USDA, South Building, Washington 25, 
D. C., U.S. A. 

Dr. C. W. Sheppard, Department of Physiology, University of Tennessee, Memphis 
3, Tennessee, U.S. A. 

Mr. Robert E. Sherman, 4143 Blaisdell Avenue S., Minneapolis 9, Minnesota, U.S. A. 

Mr. Lawrence E. Sly, Jr., 202 West 9th Avenue, Tallahassee, Florida, U. S. A. 

Mr. Herbert Stern, Jr., 737 Carol Marie Drive, Baton Rouge 6, Louisiana, U. S. A. 

Dr. Claire M. Vernier, Department of Medicine and Surgery, VA Central Office, 
Washington, D. C., U.S. A. 

Dr. Richard L. Willham, 3624 Ross Road, Ames, Iowa, U. S. A. 

Mr. Ralph P. Winter, 5712-38th Avenue South, Minneapolis 17, Minnesota, U.S. A. 

Mr. Donald F. Wilson, 6127 Westchester Drive, Washington 22, D. C., U.S. A. 

Dr. Charles Wunder, Department of Physiology, State University of Iowa, Iowa 
City, Iowa, U.S. A. 


French 


M. Jean Louis Beaumont, 14 rue Petrarque, Paris 16e, France. 

M. Marcel Brunard, 17 avenue Emile-Deschanel, Paris 7e, France. 

M. Paul Damiani, Institut National de la Statistique, 29 quai Branly, Paris 7e, 
France. 

Mme. Jacqueline Roquet, Docteur en Pharmacie, 22 avenue Victoria, Paris 1°, 
France. 

M. Luu-Mau-Thanh, 19 Boulevard Brune, Paris (XIV°) France. 


German 


Dr. K. H. Barocka, Gartenstr. 6, Einbeck/Hann., Germany. 

Prof. Dr. B. Baule, Nibelungenstr. 63, Graz/Oesterreich, Austria. 

Dr. W. D. Froehlich, Rochusweg 12, Bonn, Germany. 

Dr. H. G. Kmoch, Inst. f. Pflanzenbau, Katzenburgweg 5, Bonn/Rhein, Germany. 

Dr. J. Krippl, Nordendstr. 14, Munchen 13, Germany. 

Dr. R. Krussmann, Anthropol. Institut, University, Mainz, Germany. 

Dr. Gunter Wricke, Post Nordhorn, Fa. v. Lochow-Petkus, Klausheide, Germany. 


India 


Mr. Prem Narain, Animal Genetics Division, Indian Veterinary Research Institute, 
Izatnagar, U.P., India. 

Mr. G. Narasimharao, Asst. Physiologist, Sugarcane Research Station, Anakapelle, 
India. 

Mr. J. S. Ramaratnam, 293/7 Suifabad Lane, Khairabad, Hyderabad, India. 


Japan 


Mr. Masaki Horie, Haraikata-machi 9, Shinjyuku-ku, Tokyo, Japan. 

Mr. Kiyoo Kimura, Department of Mathematics, Mie Prefectural University, 
11 Ootani-cho, Tsu Mie, Japan. 

Mr. Kozuo Kitamura, Saitama Prefectural Agricultural Experiment Station, Ageo 
City, Saitama, Prefecture, Japan. 

Mr. Masahiko Sugimura, Kumamoto Women’s University, Ooemachi, Kumamoto 
City, Japan. 


Netherlands 


Dr. K. J. van Deen, Laboratorium voor Sociale Geneeskunde, Oostersingel 69 I, 
Groningen, Netherlands. 


WNAR 


Mr. Herbert B. Eisenberg, 1329 22nd Street, Santa Monica, California, U. S. A. 

Dr. William R. Gaffey, 3119 Eton Avenue, Berkeley 5, California, U.S. A. 

Mr. William B. Owen, Statistical Laboratory, Colorado State University, Fort 
Collins, Colorado, U. S. A. 

Mr. Patrick K. Tomlinson, Calif. State Fisheries, 511 Tuna Street, Terminal Island, 
San Pedro, California, U. S. A. 


NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary (if members 
at large, to the General Secretary) news of appointments, distinctions, or retirements, 
and announcements of professional interest. 


NEWS ABOUT MEMBERS 


Victor Chew has accepted a part-time position (starting September 1, 1961) in 
the Department of Biostatistics of the Johns Hopkins University. He will divide 
his time between Baltimore, Maryland and Dahlgren, Virginia, where he is a 
mathematical statistician in the Operations Research Branch of the U. S. Naval 
Weapons Laboratory. 

John J. Gart will spend 1960-61 on a Postdoctoral Research Fellowship at 
Birkbeck College, University of London, while on leave from his position of Assistant 
Professor of Biostatistics at The Johns Hopkins University. 

David G. Gosslee, formerly with the University of Connecticut, has joined 
the Statistics Section of the Mathematics Panel at the Oak Ridge National 
Laboratory, where he will consult with biologists. 

John Gurland, formerly Professor of the Department of Statistics, Iowa State 
University, recently accepted a position as Professor of the Mathematics Research 
Center, U. S. Army, at the University of Wisconsin. 

Vincent Hodgson has completed his graduate study at the London School of 
Economics and Political Science and joined the faculty of the Department of 
Statistics, The Florida State University, Tallahassee, Florida. 

Maurice G. Kendall, professor of statistics at the University of London and 
president of the Royal Statistical Society, has been appointed to the board of 
C-E-I-R (U.K.) Ltd. He will assume the new post of director for the London-based 
company’s mathematics, statistics and operations research departments, effective 
October 1, 1961 and will then vacate his chair at the university. 

Eugene Lukacs of the Catholic University of America will take a sabbatical 
leave during the academic year 1961/62. He will spend the greater part of this 
time at the Institut de Statistique de l’Université de Paris working under an Air Force 
Grant. From April 1962 to July 1962, he will be Visiting Professor at the Swiss 
Federal Institute of Technology. 


INTERNATIONAL SEMINAR 
on 
Biometric Methods in Medicine and 
Genetics 


organized by the Swiss-Austrian 
Group of the International Biometric Society 


September 18 to 22, 1961, in Vienna, Austria. 


The purpose of the seminar is to give participants a basic and systematic 
training in the biometric treatment of medical-therapeutic problems (basic 
principles of planning and evaluating the relevant experiments, special 
experimental designs and their rationale, sequential methods) and of questions 
in genetics (estimation of gene frequencies, questions of evolution and 
mutation, inheritance of quantitative characters). Both subject areas are 
topical and of interest to medical researchers and geneticists alike. To keep 
the instruction effective, at most four lectures are planned for each day. The 
only mathematical background assumed is the usual secondary-school curriculum. 
The teaching programme will be conducted by selected specialists. Since the 
number of participants is limited, we ask that provisional registrations be 
sent now to the local conference secretariat: Institut für Statistik an der 
Universität Wien, Wien I., Rathausstraße 19/11/3. 

The detailed programme and the final registration form will be sent out in the 
summer of 1961. Participation fee: sFr 30 (or Austrian S 180, DM 30). 


Prof. Dr. A. Linder, Geneva        Prof. Dr. S. Sagoroff, Vienna 
Prof. Dr. H. L. Le Roy, Zürich     Prof. Dr. L. Schmetterer, Vienna 
Utilisation de calculateurs électroniques dans l’analyse d’expériences avec 
Planification des expériences d’alimentation des vaches laitières. .. H. L. Lucas 
La sélection artificielle suivant des caractères quantitatifs..... H. L. Le Roy 


La répartition des sexes chez des cucurbitacées hybrides et leur relation 
avec une distribution contagieuse de Neyman................. F. WEILING 


La puissance du critère F dans l’analyse de la variance de plans en blocs au 
hasard, nomogrammes pour le choix du nombre de répétitions. ...M. KEULS 


Développement des méthodes biométriques et statistiques dans la recherche 
agronomique au Congo Belge et au Ruanda-Urundi.......... J. M. HENRY 


Développement des méthodes biométriques et statistiques dans la recherche 


L’emploi des méthodes biométriques dans l’étude et l’expérimentation des 
variétés de betterave sucrière en Belgique. ..... . . M. Simon, N. Roussel 


Les méthodes d’échantillonnage en sylviculture............. ANNE LENGER 


Radiosensibilité des graines d’andropogon issues de terrains uranifères et non- 
uranifères du Katanga... ... Dr. J. Mewissen, J. Damblon, et Z. M. 


Dix années d’activités de la Biometric Society en Belgique, au Congo Belge 
et au Ruanda-Urundi 


Revue bibliographique. 


Tome II, N° 1 Mars, 1961 


Quelques problèmes de numérations bactériennes................ H. Grimm 


Comparaison de deux méthodes d’analyse de données non orthogonales dans 
le cas d’expériences à deux facteurs. ..... P. GILBERT ET B. GROSSMANN 
Analyse de données non orthogonales dans le cas d’une expérience à deux 
Revue bibliographique: D. J. Finney: Experimental Design and its Statistical 
DONALD F. MORRISON 
An Analysis of Some Relay Failure Data from a Composite 
Exponential ..... R. R. PRAIRIE AND B. OSTLE 


Applications of Truncated Distributions in Process Startups 


INTERNATIONAL JOURNAL OF ABSTRACTS 
STATISTICAL THEORY AND METHOD 


A Journal of the International Statistical Institute 


The aim of this journal of abstracts is to give complete coverage 
of published papers in the field of statistical theory (including 
associated aspects of probability and other mathematical methods) 
and new published contributions to statistical method. 


All contributions in the following five journals, which are wholly 
devoted to this field, are abstracted: Annals of Mathematical 
Statistics; Biometrika; Journal, Royal Statistical Society (Series 
B); Bulletin of Mathematical Statistics; Annals, Institute of 
Statistical Mathematics. A further group of six journals are 
abstracted on a virtually complete basis: Biometrics; 
Metrika; Metron; Review, International Statistical Institute; 
Technometrics; Sankhyā. There are about 250 other journals 
partly devoted to statistical theory and method from which the 
appropriate papers are abstracted. 


The abstracts are about 400 words long, the length recommended 
by UNESCO for the “long” abstract service. They are in the 
English language, although the original language of the paper is 
noted on the abstract together with the name of the abstractor. In 
addition, the addresses of the author(s) are given in detail to 
facilitate contact in order to obtain further detail or request an 
off-print. The journal is published quarterly and contains 
approximately 1000 abstracts per year. 


A scheme of classification has been developed for the abstracts 
that is flexible and facilitates the transfer of code numbers to 
punched cards. A unique aspect of this journal is that the pages 
are colour-tinted according to the main sections of classification. 
This method of colour-coding the pages provides a distinctive 
and powerful visual aid in the identification of abstracts in what- 
ever manner the journal is filed for reference. 


Annual Subscription £5 (U.S.A. and Canada $16.00) 
Single Number 30s (U.S.A. and Canada $4.50) 


OLIVER AND BOYD LTD. 
Tweeddale Court, 14 High Street, Edinburgh, 1 


INFORMATION FOR CONTRIBUTORS 
Manuscripts 


Contributions for Biometrics may be addressed to Dr. Ralph A. Bradley, 
Department of Statistics, The Florida State University, Tallahassee, Florida, U.S.A.; 
authors residing in the following Society Regions can expedite consideration of 
papers by submitting them to the appropriate Associate Editor, namely: BRITISH 
REGION: Dr. S. C. Pearce, East Malling Research Station, East Malling, 
Maidstone, Kent, England; AUSTRALASIAN REGION: Dr. E. A. Cornish, 
University of Adelaide, Adelaide, Australia; FRENCH REGION: Dr. Georges Teissier, 
Faculté des Sciences de Paris, 1 rue V. Cousin, Paris, France. QUERIES, NOTES, 
and related correspondence should be directed to Dr. D. J. Finney, Department of 
Statistics, University of Aberdeen, Meston Walk, Old Aberdeen, Scotland. Books 
and material for Book Reviews should be sent to Mr. J. G. Skellam, The Nature 
Conservancy, 19 Belgrave Square, London, S.W. 1, England. 

MANUSCRIPTS must be submitted in triplicate, with typescript doublespaced 
throughout. Marginal notes may obviate typographical difficulties presented for 
complicated formulae or tables; authors should not attempt editorial instructions 
or markings for the printer. TABLES should be identified by arabic number and 
by a short descriptive title. ILLUSTRATIONS should also be identified by arabic 
number and by a brief caption. (Captions should not be included in illustrations, 
but should be typewritten collectively on an accompanying sheet.) Originals 
should be approximately 8.5 x 11 in. (21.5 x 28 cm.). The original of each chart, 
diagram, or graph should be executed in black on white drawing paper or board, 
on blue tracing linen, or on coordinate paper ruled in blue only; coordinate lines 
to be reproduced should be ruled in black. For printing, illustrations may be re- 
duced to % or % original dimensions. Lines should therefore be of sufficient thick- 
ness, and decimal points, periods, and stippled dots should be solid black circles 
large enough to reproduce well. Lettering and numerals should be at least 1 mm. 
high when reproduced in a cut 3 in. (7.5 cm.) wide. Photographs should be prints 
on glossy paper with strong contrasts, and if grouped in a plate should be mounted 
contiguously. All tables and illustrations should be mentioned explicitly in the 
text. REFERENCES (BIBLIOGRAPHIC) should be collectively listed alpha- 
betically by author; textual citation by author and year is preferred. 

ABSTRACTS 

Abstracts of papers presented at meetings of the Biometric Society or of its 
regions are printed in Biometrics following such meetings. They should be sub- 
mitted to the person designated to receive them for a particular meeting in exactly 
the form published in Biometrics (except for an Abstract Number), doublespaced 
and in duplicate. Use of formulae requiring display printing is to 


NOTICES, ANNOUNCEMENTS, AND BIOMETRIC SOCIETY REPORTS 

International and regional reports and notices should be submitted by the 
appropriate officers of the Society and its Regions in duplicate, doublespaced on 
separate sheets exactly as they are to be printed in Biometrics. Other material to 
be printed in News and Announcements should also be submitted doublespaced 
and in duplicate. 


BACK ISSUES 


Back issues of Biometrics are available at the following postage-paid 
prices in U.S.A. currency: 


                                   Price per        Price per 
Year    Volume    Number           Single Number    Volume (unbound) 


                  1 to 6 
                  1 to 6 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 
                  1 to 4 


Reprints of individual articles are not available except to authors at the 
time of printing. Three special issues are among the numbers listed 
above. They are: 


1947 Volume 3 Number 1 The Analysis of Variance 
1951 Volume 7 Number 1 Components of Variance 
1957 Volume 13 Number 3 The Analysis of Covariance 


Also available are: 
Fishery Reprint Series (Selected reprints from Vol. 5)      $1.00 
Subject Index (Volumes 1-10)                                 1.00 
Proceedings, International Biometric Symposium, 
Campinas, Brazil, 1955                                       1.00 


Inquiries, non-member subscriptions, and orders for back issues and 
other material listed above should be addressed to: BIOMETRICS, DEPARTMENT 
OF STATISTICS, THE FLORIDA STATE UNIVERSITY, TALLAHASSEE, 
FLORIDA, U.S.A. 